Playing with Kernel TLS in Linux 4.13 and Go (filippo.io)
130 points by FiloSottile on Sept 6, 2017 | 16 comments


That kernel patch sets off some alarms. Dozens of goto statements and many of them are not the common C exit/cleanup idiom; backwards jumps are frequent and following some of these functions is difficult. tls_sw_sendmsg and tls_sw_sendpage have 8 and 5 goto labels respectively.

Ew.

Might all be brilliant and flawless, but it's not obvious.


As someone who has contributed a bit to the Linux kernel, I agree with you that the use of goto in these two functions is... unusual. They do seem to fall into a few patterns, however.

On a first look at the first function (tls_sw_sendmsg), we appear to have:

- send_end is the normal cleanup target at the end of the function, that's common and found almost everywhere;

- alloc_encrypted, alloc_plaintext, push_record seem to be "retry" labels, where the code goes back and tries again;

- fallback_to_reg_send is the "then" case of an if, could be removed by inverting the sense of the test;

- wait_for_sndbuf, wait_for_memory are in the style of a cleanup target, but they end by jumping to one of the "retry" labels above;

- trim_sgl is... well, it seems to be an attempt to avoid code duplication, or something like that; the "goto trim_sgl;" could be replaced by "trim_both_sgl(sk, orig_size); goto send_end;".

Yeah, this code looks more like assembly than C. That appears to be the kind of hand-optimized code usually reserved for hot functions, where the "normal" case encounters as few branches as possible; the fallback_to_reg_send case points to that (the normal case would be the one without the goto). So the way to understand the control flow of these functions would be: ignore the gotos on first reading, since that would be the normal case, and after that look at each goto and understand what it does.
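To make the label patterns above concrete, here is a minimal userspace sketch in C. It is not taken from the kernel patch: copy_message, flaky_pool, pool_alloc and pool_wait are hypothetical names, and the pool only simulates memory pressure by failing a fixed number of times. It shows a "retry" label, the common cleanup target, and an out-of-line slow path that ends in a backward jump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pool that fails the first `failures_left` allocations. */
struct flaky_pool {
    int failures_left;
    int waits;          /* how many times we had to wait for memory */
};

static void *pool_alloc(struct flaky_pool *p, size_t n)
{
    if (p->failures_left > 0) {
        p->failures_left--;
        return NULL;
    }
    return malloc(n);
}

/* Stand-in for sleeping until memory is available. */
static void pool_wait(struct flaky_pool *p)
{
    p->waits++;
}

/* Returns the number of bytes copied, or -1 if we gave up waiting. */
static long copy_message(struct flaky_pool *p, const char *msg,
                         size_t len, int max_waits)
{
    long ret = -1;
    char *buf = NULL;

    assert(p != NULL && msg != NULL);

alloc_buffer:                   /* "retry" label, re-entered from below */
    buf = pool_alloc(p, len);
    if (!buf)
        goto wait_for_memory;   /* slow path is kept out of line */

    memcpy(buf, msg, len);      /* normal case: straight-line code */
    ret = (long)len;

send_end:                       /* common cleanup target */
    free(buf);                  /* free(NULL) is a no-op */
    return ret;

wait_for_memory:                /* cleanup-style block that jumps back */
    if (p->waits >= max_waits)
        goto send_end;          /* give up: ret is still -1 */
    pool_wait(p);
    goto alloc_buffer;
}
```

Reading it the way suggested above, skipping the gotos gives the normal case (allocate, copy, free); each goto then describes one exceptional path.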


The Linux kernel has a lot of gotos in its code.


Other than the jump-to-exit stuff?


Yeah, the "goto retry" (jump back to the beginning of the function or the loop) is also common.


Still pretty structured usage.


Solaris had kssl doing much more of this 7-8 years ago. In their case SPARC T1+ CPUs had hardware support to accelerate crypto ops but iirc kssl did not depend on it.

It also did not require applications to support/know about SSL/TLS. So after moving to Solaris 10 we were able to make some legacy apps use SSL/TLS without adding any code at all! Pretty cool stuff.

They did have a few vulnerabilities causing kernel panic that Snorcle had to fix - so in terms of adding more complexity to the kernel it's a risky approach but in our case it was totally worth it and it helped that the SSL/TLS traffic was all internal - nothing public facing.


> we were able to make some legacy apps use SSL/TLS without adding any code at all

How did you deal with certificate validation?


This[0] seems to answer your questions.

TL;DR: The kernel does it all and passes the unencrypted traffic to the local port specified. There's a command to configure it with the appropriate keys, etc.

[0] http://www.c0t0d0s0.org/archives/5575-Less-known-Solaris-Fea...


Sounds like nginx streams - http://nginx.org/en/docs/stream/ngx_stream_core_module.html . Really useful if you need TLS but the author of whatever you are running couldn't be bothered to add support (includes being able to do client certificate auth!).


So, the same API semantics as IPSec transport mode, but not using IPSec. Seems like the best of both worlds, really.


Well the app didn't care - we had self signed certs (actually internal CA signed IIRC) that we had to feed to the kernel ssl module using the ksslconfig commands.

Edit: @Mister_Snuggles points to a link below if you wanted to run through the whole process.


So the question is, what's the result? Is it actually more performant? If not, have you identified where further work is needed to make it more performant?


I read something about a 5% performance increase, which is relevant for big players like Google and Facebook, who would save hundreds of thousands of dollars, but rather irrelevant for small companies.


He hasn't managed to make it work - it kernel panics for him.

He mentions that fb noticed a significant performance improvement [1].

You should see improvements in some cases, i.e. when you can do zero-copy transfers (avoiding kernel->user and then user->kernel data copying), in other words when you pipe data from one socket/file to another socket/file.

[1] https://netdevconf.org/1.2/papers/ktls.pdf
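The "pipe data from one file to another socket/file" case above is what sendfile(2) does on Linux. Here is a rough sketch in C, assuming Linux and a regular file as input; copy_fd_in_kernel and demo_roundtrip are hypothetical helper names, and the demo uses two temporary files instead of a socket so that it can run anywhere.

```c
#define _GNU_SOURCE             /* for mkstemp() and pread() under strict -std */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy all of in_fd to out_fd without a userspace buffer.
 * Returns the number of bytes copied, or -1 on error. */
static ssize_t copy_fd_in_kernel(int out_fd, int in_fd)
{
    struct stat st;
    off_t offset = 0;

    if (fstat(in_fd, &st) < 0)
        return -1;

    while (offset < st.st_size) {
        /* sendfile() advances `offset` by the number of bytes it moved. */
        ssize_t n = sendfile(out_fd, in_fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;
        }
        if (n == 0)
            break;              /* input shrank underneath us */
    }
    return (ssize_t)offset;
}

/* Usage sketch: round-trip a small payload between two temporary files.
 * Returns 0 if the copy matches the original, -1 otherwise. */
static int demo_roundtrip(void)
{
    char src_path[] = "/tmp/sendfile-src-XXXXXX";
    char dst_path[] = "/tmp/sendfile-dst-XXXXXX";
    const char payload[] = "hello over sendfile";
    char readback[sizeof(payload)] = {0};
    int ok = -1;

    int src = mkstemp(src_path);
    int dst = mkstemp(dst_path);
    if (src < 0 || dst < 0)
        goto out;
    if (write(src, payload, sizeof(payload)) != (ssize_t)sizeof(payload))
        goto out;
    if (copy_fd_in_kernel(dst, src) != (ssize_t)sizeof(payload))
        goto out;
    if (pread(dst, readback, sizeof(readback), 0) != (ssize_t)sizeof(readback))
        goto out;
    ok = memcmp(payload, readback, sizeof(payload)) == 0 ? 0 : -1;
out:
    if (src >= 0) { close(src); unlink(src_path); }
    if (dst >= 0) { close(dst); unlink(dst_path); }
    return ok;
}
```

With a plain TLS library in userspace this shortcut is not available, because the data has to pass through the process to be encrypted; that is the gap kernel TLS is meant to close.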


No, he said:

> I ran a simple HTTPS web server with net/http, loaded a page on Chrome, and instead of causing a kernel panic...

Followed by demonstrating it working.

However, the point remains that he only got it working up to doing a toy hello world. The part that would be important for performance would be what he mentioned isn't finished, which is allowing it to be used with sendfile so that web servers can just sendfile over a TLS connection and let the kernel handle all of the IO.
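For context, handing a connection over to the kernel looks roughly like the sketch below. It follows the setsockopt sequence from the kernel's TLS documentation (attach the "tls" ULP, then pass the transmit keys with TLS_TX) and assumes TLS 1.2 with AES-128-GCM. The handshake still happens in userspace; struct tx_secrets and enable_ktls_tx are hypothetical names for whatever your TLS library gives you after the handshake.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <linux/tls.h>

/* Fallbacks for older libc headers; values match the Linux UAPI. */
#ifndef SOL_TCP
#define SOL_TCP 6
#endif
#ifndef TCP_ULP
#define TCP_ULP 31
#endif
#ifndef SOL_TLS
#define SOL_TLS 282
#endif

/* Hypothetical container for the write-side secrets of a finished
 * userspace handshake. */
struct tx_secrets {
    unsigned char key[TLS_CIPHER_AES_GCM_128_KEY_SIZE];
    unsigned char salt[TLS_CIPHER_AES_GCM_128_SALT_SIZE];
    unsigned char iv[TLS_CIPHER_AES_GCM_128_IV_SIZE];
    unsigned char rec_seq[TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE];
};

/* Returns 0 if the kernel accepted the keys, -1 otherwise (errno is set). */
static int enable_ktls_tx(int sock, const struct tx_secrets *s)
{
    struct tls12_crypto_info_aes_gcm_128 ci;

    assert(s != NULL);

    /* Attach the "tls" upper-layer protocol to the TCP socket. */
    if (setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
        return -1;

    memset(&ci, 0, sizeof(ci));
    ci.info.version = TLS_1_2_VERSION;
    ci.info.cipher_type = TLS_CIPHER_AES_GCM_128;
    memcpy(ci.key, s->key, sizeof(ci.key));
    memcpy(ci.salt, s->salt, sizeof(ci.salt));
    memcpy(ci.iv, s->iv, sizeof(ci.iv));
    memcpy(ci.rec_seq, s->rec_seq, sizeof(ci.rec_seq));

    /* From here on, plain write()/send() on the socket produce TLS records. */
    if (setsockopt(sock, SOL_TLS, TLS_TX, &ci, sizeof(ci)) < 0)
        return -1;

    return 0;
}
```

Once this succeeds, sendfile() on the socket lets the kernel read the file, encrypt it and send it without the data visiting userspace. The call fails with -1 if the fd is not a connected TCP socket or the tls module is not available.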



