> *I've gained a huge knowledge of low level programming* This is really interes...

_euvw · on Nov 21, 2018

Not realtime, because that only enforces 'precision', not low latency per se.

When I was working in this field, 2008-2011, there were guys doing FPGAs, custom TCP/IP stacks, custom network drivers, and dedicated networks and network cards for exchange data coming in and going out. Mostly Linux.

Hardware and low-level fun.

Although the fastest trades were always done by this one Catalonian guy using Windows and .NET. I kid you not.

Good times. Soulless. But good.

exikyut · on Nov 21, 2018

How does realtime enforce precision and not latency? I was referring to hard realtime.

And wow, so I wasn't too far off the mark. FPGAs and exotic networking. Huh.

I remember reading a story about a trading floor running on SQL Server, which was doing continuous throughput of 6000 queries/second. I didn't know enough at the time to discern what percentage of that was writes, but I think the point may have been that it was all of it. This was quite a few years ago. So perhaps Windows isn't actually the slowe{st,r} system out there for certain tasks.

_euvw · on Nov 21, 2018

As I've always understood it (but I'm no RTOS expert), an RTOS does not guarantee LOWER latency. It guarantees A latency.

But again: not an RTOS expert. We had a lab that would constantly test configurations of hardware and software, and I remember them finding that an RTOS wasn't helpful.

speleo_engr · on Nov 21, 2018

That's right, real-time does not mean real-fast. In a hard real-time system, there is a deterministic worst-case bound for response times. "Real fast" CPUs, like the latest and greatest Intel CPUs, are actually pretty difficult to get deterministic bounds on. There are factors like unpreventable SMI events, possibility of L1/L2/L3 cache misses, etc. Often systems that need to be really deterministic, like say an engine controller in a car, run on simple CPUs like the Cortex-R series from ARM.
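
A minimal sketch of that distinction, using made-up numbers rather than measurements: `fast_cpu`, `simple_cpu`, and `DEADLINE_US` are hypothetical, and a real hard real-time guarantee comes from worst-case analysis of the hardware and software, not from taking the max of a sample as done here.

```python
from statistics import mean

# Hypothetical response times in microseconds (illustrative, not measured).
# "Fast" CPU: usually 2 us, but one sample hits a 500 us stall
# (standing in for something like an SMI or a run of cache misses).
fast_cpu = [2] * 999 + [500]
# "Simple" CPU: slower on average, but every response takes 20 us.
simple_cpu = [20] * 1000

DEADLINE_US = 50  # assumed hard deadline


def summarize(latencies_us):
    """Return the average and the worst observed latency."""
    return {"mean": mean(latencies_us), "worst": max(latencies_us)}


def meets_hard_deadline(latencies_us, deadline_us):
    """A hard deadline is met only if the worst case is within the bound."""
    return max(latencies_us) <= deadline_us


for name, samples in (("fast_cpu", fast_cpu), ("simple_cpu", simple_cpu)):
    stats = summarize(samples)
    ok = meets_hard_deadline(samples, DEADLINE_US)
    print(f"{name}: mean={stats['mean']:.3f} us, "
          f"worst={stats['worst']} us, meets deadline={ok}")
```

The fast CPU wins on average latency but misses the deadline once, while the simple CPU is slower on average and never misses. The second property is the one hard real-time is about.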

exikyut · on Nov 22, 2018

> "Real fast" CPUs, like the latest and greatest Intel CPUs, are actually pretty difficult to get deterministic bounds on. There are factors like unpreventable SMI events, possibility of L1/L2/L3 cache misses, etc.

Oh yeah. I remember reading something along the same lines about x86 a while back. I guess it didn't really sink in properly, heh. Thanks.

I'm reminded of the "x86 is high level" thing: https://news.ycombinator.com/item?id=9264195

Also, I think the iPhone 6's NVMe apparently uses a Cortex-R: https://ramtin-amin.fr/#nvmepcie

lordnacho · on Nov 21, 2018

I'd say AVX512 is maybe not so great, because it can cook your CPU to the point where it slows down the clock. AVX2 is probably required. But above all, test. Have a bunch of compilers, read about all the options, see what is fastest.

FPGA feed handlers are common, but now that can also be rented.

Whether you're using GPUs depends on what you're up to. A lot of the strategy testing requires a bunch of computing power but not speed. You then take your conclusions and implement something fast that doesn't necessarily use the GPU.

Realtime, but soft real time. It's not like a vehicle ABS system where you have to brake within x milliseconds or someone gets killed. I've seen places where they see the degradation over time and eventually decide it's time for the newest hardware, again.

exikyut · on Nov 21, 2018

Ah, I see. That reminds me of https://stackoverflow.com/questions/8389648/ (7 years ago, just normal AVX).

I also just found http://redd.it/8dhp7q asking about AVX512 slowdowns too.

TIL about FPGA feed handlers. (http://redd.it/56tw4n, one of the first hits for the term, was mildly interesting)

Hmm, good point about not needing speed. Yeah, 24 execution units each capable of 3 billion ops/sec is probably more performance than is needed :)

Interestingly, I would have imagined HFT as needing ABS-style hard realtime. But no, it neither needs that nor is simple enough to be encapsulated by that sort of embedded-style approach.