Posted on November 4, 2019

Introducing tartiflette-plugin-scalars
This month I worked on a plugin for the Tartiflette GraphQL engine. It’s called tartiflette-plugin-scalars and it is here to help you validate data using scalar types in your GraphQL schema.

If you’ve never heard of Tartiflette before, it’s Dailymotion’s in-house GraphQL engine in Python 3.6. It aims at offering an SDL-first schema definition, allowing developers to focus on business-critical code. Tartiflette makes use of asyncio and can be used over aiohttp or ASGI.

Illustration by Hayford Peirce

tartiflette-plugin-scalars comes with a list of common types such as DateTime, URL or IPv4. Using those types in your GraphQL schema guarantees that the data you receive, as well as the data you send back, is in the correct format. When possible, the plugin will automatically serialize and deserialize the types you use to and from Python 3 types, like datetime.datetime.
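To give an idea of what "serialize and deserialize" means for a scalar, here is a minimal standard-library sketch. It is not the plugin's code: the function names are hypothetical, and it only assumes that an IPv4 value maps to ipaddress.IPv4Address and a DateTime value to an ISO 8601 string.

```python
from datetime import datetime, timezone
from ipaddress import IPv4Address


def serialize_ipv4(value: IPv4Address) -> str:
    """Turn the Python object into the string sent back to the client."""
    return str(value)


def parse_ipv4(raw: str) -> IPv4Address:
    """Validate a client-supplied string; raises ValueError if malformed."""
    return IPv4Address(raw)


def serialize_datetime(value: datetime) -> str:
    """Output coercion: datetime to an ISO 8601 string."""
    return value.isoformat()


def parse_datetime(raw: str) -> datetime:
    """Input coercion: ISO 8601 string to datetime (raises ValueError if invalid)."""
    return datetime.fromisoformat(raw)
```

The point is that invalid input is rejected at the edge of the API, so a resolver only ever sees well-formed Python objects.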

If you want to start using it right now, you can install it through pip:

pip install tartiflette && pip install tartiflette-plugin-scalars

Then initialize the engine with create_engine, indicating you want to use tartiflette_plugin_scalars and an SDL file.

from tartiflette import create_engine

with open("schema.sdl") as schema:
    engine = await create_engine(
        schema.read(),
        modules=[
            {
                "name": "tartiflette_plugin_scalars",
                "config": {},
            }
        ],
    )

Then you can start using the custom scalars to define your SDL. In the following example, we have a query that can return an IPv4 address with a port and a mutation that accepts a GUID as input and returns it:

type Query {
  ipAddress: IPv4
  port: Port
}

type Mutation {
  checkGUID(input: GUID!): GUID!
}

Write your resolvers as simple functions using the @Resolver decorator:

from ipaddress import ip_address

from tartiflette import Resolver


@Resolver("Query.ipAddress")
async def resolve_ip_address(parent, args, ctx, info):
    return ip_address("127.0.0.1")


@Resolver("Query.port")
async def resolve_port(parent, args, ctx, info):
    return 8080


@Resolver("Mutation.checkGUID")
async def resolve_guid(parent, args, ctx, info):
    return args["input"]

Your program will be ready to accept requests. Tartiflette’s type system will automatically validate the data you resolve in queries and mutations:

>>> await engine.execute("query ip { ipAddress port }")
{'data': {'ipAddress': '127.0.0.1', 'port': 8080}}
>>> await engine.execute("mutation GUID { checkGUID(input:\"1df282eb-a458-4763-a3ca-619c320c5a3e\") }")
{'data': {'checkGUID': '1df282eb-a458-4763-a3ca-619c320c5a3e'}}

If you want to use tartiflette with an HTTP server, the plugin tartiflette-aiohttp is there to help you do this. You can install it by running:
pip install tartiflette-aiohttp
And start your engine with aiohttp using register_graphql_handlers:

from aiohttp import web
from tartiflette_aiohttp import register_graphql_handlers

web.run_app(
    register_graphql_handlers(
        web.Application(),
        engine=engine,
    )
)

The API will be exposed by default on http://localhost:8080/graphql. You can try it with a simple query and HTTPie:

➜ http --json POST http://localhost:8080/graphql \
> query="query ip { ipAddress port }"
HTTP/1.1 200 OK
Content-Length: 50
Content-Type: application/json; charset=utf-8
Date: Thu, 31 Oct 2019 18:24:02 GMT
Server: Python/3.7 aiohttp/3.6.2

{
    "data": {
        "ipAddress": "127.0.0.1",
        "port": 8080
    }
}

As you can see, Tartiflette makes getting started with GraphQL in Python simple and straightforward. The full code for this article’s example is available on GitHub, as is Tartiflette’s source code.
Posted on September 13, 2019

Low Tech Crypto : One-Time Pad
Have you ever wondered how you would secure your communication if an electromagnetic pulse destroyed all the computers around you? Do you distrust hardware manufacturers so much that you refuse to rely on anything you didn’t carve out of wood?

If you answered yes to any of these questions, this series of articles is for you, as it will explore the world of “low tech” cryptography. By this, I mean cryptographic tools that can be used without a computer while still preserving high-security standards.

As an introduction, I will talk about one of the most famous encryption techniques (and the only one that’s theoretically uncrackable): the One-Time Pad.

Illustration by James Pond

A bit of history

The One-Time Pad is one of the oldest ciphers still considered secure in 2019, even though it was invented in 1882 by Frank Miller. At the time, it served to secure telegraphic communications between American banks.

It only gained popularity amongst the worldwide military and diplomatic corps after being re-invented by Gilbert S. Vernam in 1917. In 1945, Claude Shannon from Bell Labs proved that one-time pads were secure, even against an adversary with unlimited computing power.

Coins, cards and dice

The one-time pad has a simple requirement to work securely: a truly random key as long as the text you want to encrypt. True randomness is hard for computers to produce on their own, as they are deterministic systems that can hardly generate anything unplanned by their programs. That’s why software random number generators are usually considered pseudo-random.

The good thing is that even if randomness is hard for computers, it doesn’t have to be hard for us. There are plenty of good sources of randomness around that can be used to generate a key. The only thing you have to keep in mind is to pick one that’s in the same base as the kind of text you want to encrypt:
- If you have binary-encoded data (or Morse code), coin flips make for a perfect base 2 random number generator
- If you want to encrypt letters from the Latin alphabet (base 26), half of a 52-card deck can work
- If your plaintext is made of letters from the Latin alphabet and digits (36 characters in total), you can roll a six-sided die twice (6 × 6 = 36 outcomes)
Illustration by Ian Gonzalez
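As a rough sketch of how physical randomness becomes key material, the snippet below maps coin flips to bits and pairs of die rolls to one of 36 symbols. The function names and the symbol ordering (A to Z, then 0 to 9) are my own assumptions, and it assumes two rolls of a six-sided die are combined into 6 × 6 = 36 equally likely values.

```python
import string

# 26 letters followed by 10 digits: 36 symbols in total (ordering is an assumption)
ALPHABET = string.ascii_uppercase + string.digits


def rolls_to_symbol(first: int, second: int) -> str:
    """Map two rolls of a six-sided die (1-6 each) to one of 36 symbols."""
    if not (1 <= first <= 6 and 1 <= second <= 6):
        raise ValueError("each roll must be between 1 and 6")
    return ALPHABET[(first - 1) * 6 + (second - 1)]


def flips_to_bits(flips: str) -> list:
    """Map a string of coin flips such as "HTTH" to bits (H=1, T=0)."""
    return [1 if flip == "H" else 0 for flip in flips]
```

Every pair of rolls lands on a distinct symbol, so each of the 36 characters is equally likely as long as the die is fair.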

OTP in practice

Once you have generated a key as long as the text you want to encrypt, you can generate a ciphertext by adding the key to the plaintext, modulo the base.

Let’s take the string “HELLOWORLD” as an example. If you convert it to base 26 numbers (0 for A, 25 for Z), it becomes [7, 4, 11, 11, 14, 22, 14, 17, 11, 3]. Now, using half a pack of cards, or two ten-sided dice and a six-sided die, we pick the following random numbers: [0, 23, 25, 14, 6, 15, 22, 6, 9, 23]. The last step is to add the plaintext to the key, modulo 26. In Python, it would match the following code:

>>> plaintext = [7, 4, 11, 11, 14, 22, 14, 17, 11, 3]
>>> key = [0, 23, 25, 14, 6, 15, 22, 6, 9, 23]
>>> [(p + k) % 26 for p, k in zip(plaintext, key)]
[7, 1, 10, 25, 20, 11, 10, 23, 20, 0]

If anyone tried to decode the ciphertext obtained above directly, they would read “HBKZULKXUA”. If they have the key, they can reverse the previous operation by subtracting the key from the ciphertext, modulo 26.

>>> ciphertext = [7, 1, 10, 25, 20, 11, 10, 23, 20, 0]
>>> key = [0, 23, 25, 14, 6, 15, 22, 6, 9, 23]
>>> [(c - k) % 26 for c, k in zip(ciphertext, key)]
[7, 4, 11, 11, 14, 22, 14, 17, 11, 3]

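The two steps above can be wrapped into a pair of small functions that work directly on letters. This is only a sketch: the function names are mine, and it assumes uppercase A to Z text and a key given as a list of numbers between 0 and 25.

```python
import string

ALPHABET = string.ascii_uppercase  # A=0 ... Z=25


def otp_encrypt(plaintext: str, key: list) -> str:
    """Add the key to the plaintext, letter by letter, modulo 26."""
    if len(key) != len(plaintext):
        raise ValueError("the key must be exactly as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + k) % 26] for p, k in zip(plaintext, key)
    )


def otp_decrypt(ciphertext: str, key: list) -> str:
    """Subtract the key from the ciphertext, letter by letter, modulo 26."""
    if len(key) != len(ciphertext):
        raise ValueError("the key must be exactly as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(c) - k) % 26] for c, k in zip(ciphertext, key)
    )


key = [0, 23, 25, 14, 6, 15, 22, 6, 9, 23]
print(otp_encrypt("HELLOWORLD", key))  # HBKZULKXUA
print(otp_decrypt("HBKZULKXUA", key))  # HELLOWORLD
```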
Limitations

Sadly, this algorithm has a big issue: since the key is as long as the plaintext, it creates a key distribution problem. You will need a secure channel over which you can send a key as long as the plaintext efficiently, which is usually exactly the kind of thing you lack if you need to use secure encryption.

The key also needs to be regenerated for every single message (a reused key stops being unpredictable and the cipher stops being secure). This can be a very tedious and long process, which makes the one-time pad unsuitable for any lengthy message.
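To see why a key must never be reused, here is a small sketch (the second message and the helper names are made up for the example). If two messages are encrypted with the same key, subtracting one ciphertext from the other cancels the key out and leaves the difference between the two plaintexts, which an eavesdropper can start analysing without ever seeing the key.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_numbers(text: str) -> list:
    """Convert letters to base 26 numbers (A=0 ... Z=25)."""
    return [ALPHABET.index(c) for c in text]


def encrypt(plaintext: str, key: list) -> list:
    """One-time pad encryption: add the key modulo 26."""
    return [(p + k) % 26 for p, k in zip(to_numbers(plaintext), key)]


def difference(a: list, b: list) -> list:
    """Element-wise subtraction modulo 26."""
    return [(x - y) % 26 for x, y in zip(a, b)]


key = [0, 23, 25, 14, 6, 15, 22, 6, 9, 23]
first = encrypt("HELLOWORLD", key)
second = encrypt("RETREATNOW", key)  # same key reused: this is the mistake

# What the eavesdropper computes from the two ciphertexts alone
leak = difference(first, second)
# What it equals: the difference of the two plaintexts, with no key involved
plain_diff = difference(to_numbers("HELLOWORLD"), to_numbers("RETREATNOW"))
print(leak == plain_diff)  # True
```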

As I will show in my next articles, multiple cryptographers worked on solving those problems by creating more sophisticated ciphers that rely on fixed-length keys.
Posted on August 21, 2019

OpenR&Day 2019 recap

Last month I attended the 2019 edition of OpenR&Day. It was an entire day of conferences organized by Oodrive, in partnership with many French tech companies, like Dailymotion, Meetic, and BackMarket. The event took place at the Théâtre des Variétés in the center of Paris.

This is the year of Kubernetes

The main thing we can observe in this conference is that everyone in Paris wants to do Kubernetes today. People are migrating toward managed k8s on the cloud, k8s on-premise, putting their CI/CD on k8s and of course, doing data science in k8s.

The most informative talk on this subject was by Theotime Leveque from BackMarket: Kubernetes migration, a retrospective on a forced walk to heaven, in which he explains how they migrated their entire infrastructure to Kubernetes in barely a few weeks during summer. This talk explains extremely well the motivation and implications behind a move toward Kubernetes, and all the things that should be taken into account if you want to migrate an already large existing codebase.

The short talk Our journey from Jenkins to Jenkins X by Vincent Behar from Dailymotion is also an excellent introduction to the future of CI in the cloud.

Amazing guests

OpenR&Day hosted two amazing guest speakers this year.

The first one was the famous Jean-Baptiste Kempf from VideoLAN, who was here to talk to us about the life of his very large open-source project: VLC. The talk is named Feedback on the VLC community and will probably tell you a lot of things you didn’t know about this French open-source project.

The second speaker was Ayumi Moore Aoki from Women in Tech, here to initiate a discussion about feminism in the IT industry.

The Théâtre des Variétés hosting the event.

More and more GraphQL in France

Last but not least, GraphQL seems to continue having a deep impact on the French tech scene.

This year Stan Chollet from Dailymotion was here to show us the work that was done here on Tartiflette, our homemade GraphQL engine. His talk, Find out how to build a GraphQL API simply with SDL, thanks to Tartiflette, explains the motivation behind the project and what it can bring to your API.

Posted on June 23, 2019

The dangers of AES-CBC

Like many block ciphers, AES (Advanced Encryption Standard aka Rijndael) comes with plenty of different modes, all labeled with confusing three-letter names like ECB, CBC, CTR or CFB. Many developers are told that they shouldn’t use ECB (Electronic Code Book) because it doesn’t provide strong data confidentiality. However, a lot of people will assume that the very popular CBC (Cipher Block Chaining) mode is perfectly fit for all use-cases. Sadly, this is not true, because while providing very good data confidentiality, CBC does not guarantee data integrity.

What is this CBC thing about?

The first thing you need to know before understanding what CBC is, is what a block cipher is. A block cipher is a function that will take a block of plaintext (the human-readable input) of length n and a key, and use that to produce a block of ciphertext (the encrypted gibberish) of length n. AES is the most popular block cipher around right now, as it is recommended by both NIST and NSA. It operates on 128-bit blocks with keys of 128, 192 or 256 bits.

The problem here is that a function meant to take inputs of 128 bits isn’t going to encrypt a large amount of data in a single call. When confronted with that problem, the intuitive solution is to simply divide your data into multiple 128-bit blocks and call AES with the same key on each one of them.

This method is called Electronic Code Book and, as I mentioned earlier, is unsafe because data patterns might remain and serve as a basis for analysis. CBC aims to solve this by adding randomness to each call to the block cipher by applying the exclusive-or operation (XOR) to each plaintext block with the previously generated ciphertext block (or a random Initialization Vector (IV), for the first block). Decryption works by doing the process in reverse: each ciphertext block is decrypted, then XORed with the previous ciphertext block (or the IV, for the first block).
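The chaining described above can be sketched in a few lines of Python. Since the standard library has no AES, this sketch swaps in a toy, completely insecure block function (byte-wise addition of the key followed by a rotation) as a stand-in for the block cipher. All names are hypothetical, the input length must be a multiple of 16 bytes, and the fixed IV is only there to keep the example reproducible.

```python
BLOCK_SIZE = 16


def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    """Toy stand-in for AES: add the key byte-wise, rotate left. NOT secure."""
    mixed = bytes((b + k) % 256 for b, k in zip(block, key))
    return mixed[1:] + mixed[:1]


def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_encrypt_block: rotate right, subtract the key."""
    unrotated = block[-1:] + block[:-1]
    return bytes((b - k) % 256 for b, k in zip(unrotated, key))


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def ecb_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """ECB: every block is encrypted independently with the same key."""
    return b"".join(toy_encrypt_block(b, key) for b in split_blocks(plaintext))


def cbc_encrypt(plaintext: bytes, key: bytes, iv: bytes) -> bytes:
    """CBC: XOR each plaintext block with the previous ciphertext block first."""
    previous, out = iv, []
    for block in split_blocks(plaintext):
        previous = toy_encrypt_block(xor(block, previous), key)
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(ciphertext: bytes, key: bytes, iv: bytes) -> bytes:
    """CBC decryption: decrypt each block, then XOR with the previous ciphertext."""
    previous, out = iv, []
    for block in split_blocks(ciphertext):
        out.append(xor(toy_decrypt_block(block, key), previous))
        previous = block
    return b"".join(out)


key = bytes(range(16))
iv = bytes(range(16, 32))  # fixed for the demo; a real IV must be random
message = b"A" * 32        # two identical 16-byte blocks
ecb = ecb_encrypt(message, key)
cbc = cbc_encrypt(message, key, iv)
print(ecb[:16] == ecb[16:])  # True: ECB leaks the repetition
print(cbc[:16] == cbc[16:])  # False: chaining hides it
```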

Introducing Bit Flipping attacks

It’s the process of XORing plaintext blocks with the previous ciphertext block during decryption that introduces a data integrity vulnerability. If we take a look at the XOR truth table, we can see that flipping one bit of the ciphertext flips the output from 0 to 1, or from 1 to 0:

Ciphertext	Plaintext	Output
0	0	0
1	0	1
0	1	1
1	1	0

What this table tells us is that if we flip one bit in the previous block of ciphertext, the same bit will be flipped in the next deciphered block too.

Of course, that will cause the previous block of plaintext to be replaced by unpredictable garbage data. Some people could assume that if the ciphertext gets modified, the plaintext will not be anything readable and the application will simply return an error. This is a very dangerous assumption: there are many cases in which some parts of the data get scrambled without triggering any error that could warn the user of an issue, and the modified message will affect the system in the way the attacker wants.

Keep in mind that it’s also possible to alter the first block of data by simply changing the IV, which won’t cause any unpredictable plaintext changes (since the IV is discarded after decryption). This is especially dangerous as sometimes your ciphertext is just a single block.

An exploitation scenario

Let’s say Alice wants to send 100$ to Bob through an encrypted banking service. She will first encrypt the ASCII-encoded string “Send 100$ to Bob” with AES-256 and send this order to her banker. This can be done using a simple function like this one (in Go):

// requires: crypto/aes, crypto/cipher, crypto/rand, fmt, io
func encrypt(plaintext, key []byte) {
	// generate unique IV
	iv := make([]byte, aes.BlockSize)
	if _, err := io.ReadFull(rand.Reader, iv); err != nil {
		panic(err)
	}

	// instantiate block cipher
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	mode := cipher.NewCBCEncrypter(block, iv)

	// encrypt (the plaintext length must be a multiple of aes.BlockSize)
	ciphertext := make([]byte, len(plaintext))
	mode.CryptBlocks(ciphertext, plaintext)

	fmt.Printf("iv: %x\n", iv)
	fmt.Printf("ciphertext: %x\n", ciphertext)
}

Alice will send the following IV and ciphertext over an insecure channel:

iv: 9bc423909ac569b5016525cb4b2660b5
ciphertext: c6d55918176051c5a603d62cdf23fa8a

Sadly, this message will get intercepted by Eve. As Eve uses the same banking system as Alice and knows Alice is a good friend of Bob, she can guess the message will be shaped Send xxx$ to Bob. Eve will then simply compute the XOR of Send xxx$ to Bob and Send xxx$ to Eve to know how to change the ciphertext (she doesn’t need to know the actual amount of money being sent). This can be done using this simple function:

func xor(plaintext, malicious []byte) {
	output := make([]byte, len(plaintext))
	for i := 0; i < len(plaintext); i++ {
		output[i] = plaintext[i] ^ malicious[i]
	}

	fmt.Printf("%x\n", output)
}

Which will output:

00000000000000000000000000071907

Since XOR is an associative operation, she can just XOR the previous output with the IV to flip the correct bits. This way, Eve will send back to the banker the altered IV and ciphertext:

iv: 9bc423909ac569b5016525cb4b2179b2
ciphertext: c6d55918176051c5a603d62cdf23fa8a
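Eve’s computation can be reproduced with nothing but XOR. The sketch below uses the IV from this example, and the helper name is mine. It also checks why the trick works: whatever the block cipher outputs for the first block gets XORed with the IV, so any bit flipped in the IV is flipped in the plaintext.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


original_iv = bytes.fromhex("9bc423909ac569b5016525cb4b2660b5")

# Eve only needs the shape of the message, not the amount
mask = xor_bytes(b"Send xxx$ to Bob", b"Send xxx$ to Eve")
forged_iv = xor_bytes(original_iv, mask)
print(mask.hex())       # 00000000000000000000000000071907
print(forged_iv.hex())  # 9bc423909ac569b5016525cb4b2179b2

# Raw block cipher output for the first block, before the final XOR with the IV
raw_block = xor_bytes(b"Send 100$ to Bob", original_iv)
print(xor_bytes(raw_block, forged_iv))  # b'Send 100$ to Eve'
```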

When the banker decodes them through the decryption function below:

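// requires: crypto/aes, crypto/cipher, fmt
func decrypt(iv, ciphertext, key []byte) {
	// instantiate block cipher
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	mode := cipher.NewCBCDecrypter(block, iv)

	// decrypt (nothing here checks that the IV or ciphertext were left untouched)
	plaintext := make([]byte, len(ciphertext))
	mode.CryptBlocks(plaintext, ciphertext)
	fmt.Printf("plaintext: %s\n", plaintext)
}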

It will output:

plaintext: Send 100$ to Eve

What to use instead of CBC?

The safest way to avoid this pitfall is to use authenticated encryption, which ensures data integrity as well as confidentiality. Galois/Counter Mode (GCM) is a popular alternative to CBC that provides authenticated encryption with block ciphers like AES. If you really have no choice and need to use CBC, you can still secure it by computing a message authentication code (MAC) over the IV and ciphertext, for example with the popular HMAC algorithm. However, as with many crypto-related topics, message authentication is a tricky subject and would easily require another entire blog post to cover properly.
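As a rough illustration of the MAC approach, here is a sketch using Python’s standard hmac module. It assumes a separate, secret MAC key shared between Alice and the banker (the key value and function names are made up for the example) and tags the IV concatenated with the ciphertext, so that Eve’s modified IV no longer verifies.

```python
import hashlib
import hmac


def make_tag(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the IV followed by the ciphertext."""
    return hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()


def verify_tag(mac_key: bytes, iv: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Check the tag in constant time before attempting any decryption."""
    return hmac.compare_digest(make_tag(mac_key, iv, ciphertext), tag)


# Hypothetical MAC key; it should be distinct from the encryption key
mac_key = b"a-separate-secret-key-for-the-mac"
iv = bytes.fromhex("9bc423909ac569b5016525cb4b2660b5")
ciphertext = bytes.fromhex("c6d55918176051c5a603d62cdf23fa8a")
tag = make_tag(mac_key, iv, ciphertext)

forged_iv = bytes.fromhex("9bc423909ac569b5016525cb4b2179b2")
print(verify_tag(mac_key, iv, ciphertext, tag))         # True
print(verify_tag(mac_key, forged_iv, ciphertext, tag))  # False
```

The banker would reject the message as soon as verification fails, without ever decrypting the tampered data.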

Sources and more

Source code for the vulnerable application

Source code for the exploit

Bit Flipping Attack on CBC Mode - Cryptography Stack Exchange

Block cipher mode of operation - Wikipedia

Cryptography I - Coursera

Posted on March 9, 2019

BDD in Golang
Behaviour-Driven Development (BDD) is, in my opinion, one of the best development practices to tackle projects with complex business logic. BDD is meant to help communication between technical and non-technical members of the team, by creating a common (natural) language for specifications and development. It is especially useful when trying to implement a Domain-Driven methodology, as it will help the developers share a common understanding of the business logic with the clients.

As a Go developer, I was happy to find that BDD testing was very easy to implement thanks to the godog package, and I will try to show you how it can be integrated into your tests.

A bank application

We will test a very simple bank application that allows the user to deposit and withdraw money. The business logic is already written in this simple struct:

// account.go
package bank

type account struct {
	balance int
}

func (a *account) withdraw(amount int) {
	a.balance = a.balance - amount
}

func (a *account) deposit(amount int) {
	a.balance = a.balance + amount
}

Writing the specification

This file will match a specification file stored in the features folder. The specifications are written in Gherkin, a specification language based on natural language. It uses keywords that developers will be able to match against their code, such as Given, When and Then.

Note: Gherkin is available in many natural languages; make sure to always use one that all the members of your team speak fluently.

In this file we will describe two scenarios, one for deposits and one for withdrawals:

#file: features/account.feature
Feature: bank account
  A user's bank account must be able to withdraw and deposit cash

  Scenario: Deposit
    Given I have a bank account with 10$
    When I deposit 10$
    Then it should have a balance of 20$

  Scenario: Withdrawal
    Given I have a bank account with 20$
    When I withdraw 10$
    Then it should have a balance of 10$

Writing the test

The godog library allows us to run BDD tests

The sentences in the account.feature file will then need to be linked to runnable test code. Running the godog command can automatically suggest the structure for your test file. This command can be installed with go get github.com/DATA-DOG/godog/cmd/godog.

This test file will contain one function for each of the steps defined in the scenario, as well as a FeatureContext function that will link the Go functions to natural-language sentences using regular expressions, and define the setup/cleanup operations:

package bank

import (
	"fmt"

	"github.com/DATA-DOG/godog"
)

var testAccount *account

func iHaveABankAccountWith(balance int) error {
	testAccount = &account{balance: balance}
	return nil
}

func iDeposit(amount int) error {
	testAccount.deposit(amount)
	return nil
}

func iWithdraw(amount int) error {
	testAccount.withdraw(amount)
	return nil
}

func itShouldHaveABalanceOf(balance int) error {
	if testAccount.balance == balance {
		return nil
	}
	return fmt.Errorf("Incorrect account balance")
}

func FeatureContext(s *godog.Suite) {
	s.Step(`^I have a bank account with (\d+)\$$`, iHaveABankAccountWith)
	s.Step(`^I deposit (\d+)\$$`, iDeposit)
	s.Step(`^I withdraw (\d+)\$$`, iWithdraw)
	s.Step(`^it should have a balance of (\d+)\$$`, itShouldHaveABalanceOf)

	s.BeforeScenario(func(interface{}) {
		testAccount = nil
	})
}

Launching the godog command will result in the test scenarios being run (and normally, everything should be green 😉). You can also launch all the tests using go test by modifying your TestMain.

Using scenario outlines

Just like table-driven tests are a common way to write tests in Go, scenario outlines will allow you to run the same steps on a larger dataset. This will require transforming each scenario in our feature file into a scenario outline and providing test data in Examples sections:

Feature: bank account
  A user's bank account must be able to withdraw and deposit cash

  Scenario Outline: Deposit
    Given I have a bank account with <start>$
    When I deposit <deposit>$
    Then it should have a balance of <end>$

    Examples:
      | start | deposit | end |
      | 10    | 0       | 10  |
      | 10    | 10      | 20  |
      | 100   | 50      | 150 |

  Scenario Outline: Withdrawal
    Given I have a bank account with <start>$
    When I withdraw <withdrawal>$
    Then it should have a balance of <end>$

    Examples:
      | start | withdrawal | end |
      | 10    | 0          | 10  |
      | 20    | 10         | 10  |
      | 100   | 50         | 50  |

This time running godog will execute 6 scenarios and 18 steps.
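If the numbers above look surprising, it helps to picture how an outline is expanded: each row of an Examples table produces one concrete scenario, with the placeholders replaced by the row’s values. The Python sketch below only illustrates that expansion under my own naming; it is not how godog is implemented.

```python
def expand_outline(steps: list, examples: list) -> list:
    """Return one list of concrete steps per example row."""
    scenarios = []
    for row in examples:
        concrete = []
        for step in steps:
            # Replace every <placeholder> with the value from the current row
            for name, value in row.items():
                step = step.replace("<" + name + ">", str(value))
            concrete.append(step)
        scenarios.append(concrete)
    return scenarios


deposit_steps = [
    "Given I have a bank account with <start>$",
    "When I deposit <deposit>$",
    "Then it should have a balance of <end>$",
]
deposit_examples = [
    {"start": 10, "deposit": 0, "end": 10},
    {"start": 10, "deposit": 10, "end": 20},
    {"start": 100, "deposit": 50, "end": 150},
]
withdrawal_steps = [
    "Given I have a bank account with <start>$",
    "When I withdraw <withdrawal>$",
    "Then it should have a balance of <end>$",
]
withdrawal_examples = [
    {"start": 10, "withdrawal": 0, "end": 10},
    {"start": 20, "withdrawal": 10, "end": 10},
    {"start": 100, "withdrawal": 50, "end": 50},
]

scenarios = expand_outline(deposit_steps, deposit_examples) + expand_outline(
    withdrawal_steps, withdrawal_examples
)
print(len(scenarios))                             # 6
print(sum(len(steps) for steps in scenarios))     # 18
```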

Conclusion

Obviously, BDD won’t be useful for every kind of application. But I know it can help many teams easily solve some complex business problems. As usual, mastering tools like Gherkin is not enough to reap all the benefits of BDD; to do that, you should also learn about practices such as Test-Driven Development (its precursor) and Domain-Driven Design.

You can find the full code for this article on GitHub gists.
