| 1 | +--- |
| 2 | +title: NASDAQ Orders in Real-Time with Grafana |
| 3 | +linkTitle: "NASDAQ Orders in Grafana" |
| 4 | +--- |
| 5 | + |
| 6 | +In this chapter, we will build a real-time stock market monitoring solution for the whole NASDAQ stock exchange. |
| 7 | + |
| 8 | +You will learn how you can use CedarDB as the central engine to |
| 9 | + - process tens of thousands of events per second while handling a complex analytical workload, |
| 10 | + - gain real-time insights with an off-the-shelf [Grafana](https://grafana.com/) dashboard, |
| 11 | + - ingest data from any application with a Postgres-compatible connector, |
| 12 | + - do all of the above on inexpensive hardware. |
| 13 | + |
| 14 | + |
| 15 | + |
| 16 | +Since a picture is worth more than a thousand words, here is what we are building: |
| 17 | + |
| 18 | + |
| 19 | + |
| 20 | + |
| 21 | +## Just get me started ASAP |
| 22 | + |
| 23 | +```shell |
| 24 | +git clone git@github.com:cedardb/examples.git |
| 25 | +cd examples/nasdaq |
| 26 | +./prepare.sh |
| 27 | +docker compose build client |
| 28 | +docker compose up |
| 29 | +google-chrome localhost:3000 |
| 30 | +``` |
| 31 | + |
| 32 | + |
| 33 | + |
| 34 | + |
| 35 | +## Goals and Requirements |
| 36 | + |
| 37 | +We have the following requirements for our NASDAQ real-time trading dashboard: |
| 38 | + - **We want to display the current market state as it happens.** Our dashboard therefore must be able to ingest and process all order book changes as they come in - up to a hundred thousand events per second. |
| 39 | + - **We want to extract all the information we can glean from the data.** To gain an advantage over our competition, we want our insights to be informed by the full data set. We especially don't want to throw away or pre-aggregate data. We should at least be able to calculate the complete current order book for any instrument traded on NASDAQ in real-time. |
| 40 | + - **We want to focus on the business logic, not the software stack.** We want to use Grafana as an off-the-shelf dashboarding solution. We want to visualize the price history, order book depth, and other metrics without worrying about interfacing with the database. |
| 41 | + - **Cost and ease of use.** We want to do all of this on modest, cheap, and easily available hardware, like a reasonably modern laptop. We definitely don't want to depend on exotic hardware or on renting expensive, beefy cloud instances. |
| 42 | + |
| 43 | +## Dataset |
| 44 | + |
| 45 | +We will use NASDAQ's [dump of real-time orders](https://emi.nasdaq.com/ITCH/Nasdaq%20ITCH/) as data source. |
| 46 | +For an overview of the dataset and some queries to get you started, take a look at the [NASDAQ dataset page](/docs/example_datasets/nasdaq). |
| 47 | + |
| 48 | + |
| 49 | +## Architecture |
| 50 | + |
| 51 | +Let's sketch out the architecture of our stock market monitor. |
| 52 | + |
| 53 | + |
| 54 | + |
| 55 | +* **A thin *Stock Client* accepts the external Data Stream from NASDAQ and transforms it into SQL statements for CedarDB.** |
| 56 | +In a real-world deployment this might be an adapter around the UDP connection to NASDAQ serving the stock exchange events or your existing monitoring solution. |
| 57 | +Since we don't have access to the real-time data, we have to make do with the dump of one day's worth of events. We built a C++ client that does a live replay in real-time, mocking the NASDAQ exchange. |
| 58 | +* **A Grafana dashboard serves as front end to the users.** |
| 59 | +It will display the state of the stock exchange by issuing queries against CedarDB using Grafana's Postgres data source connector. |
| 60 | +* **CedarDB does the heavy lifting.** It serves both as the persistent data store for the exchange data and as the query engine driving the Grafana visualization, all by itself. |
| 61 | + |
| 62 | +## Setting up CedarDB |
| 63 | +Let's first focus on how we can set up CedarDB to store and process all our data. |
| 64 | +Afterwards, we can think about how to best get data into and out of CedarDB. |
| 65 | + |
| 66 | + |
| 67 | +### The Schema |
| 68 | +We will use the following schema to store the state of the exchange: |
| 69 | + |
| 70 | +```sql {filename="schema.sql"} |
| 71 | +begin; |
| 72 | +drop table if exists orderbook, executions, cancelations, orders, marketMakers, stocks; |
| 73 | + |
| 74 | +create table stocks |
| 75 | +( |
| 76 | + stockId int primary key, |
| 77 | + name text unique, |
| 78 | + marketCategory text, |
| 79 | + financialStatusIndicator text, |
| 80 | + roundLotSize int, |
| 81 | + roundLotsOnly bool, |
| 82 | + issueClassification text, |
| 83 | + issueSubType text, |
| 84 | + authenticity text, |
| 85 | + shortSaleThresholdIndicator bool, |
| 86 | + IPOFlag bool, |
| 87 | + LULDReferencePriceTier text, |
| 88 | + ETPFlag bool, |
| 89 | + ETPLeverageFactor int, |
| 90 | + InverseIndicator bool |
| 91 | +); |
| 92 | + |
| 93 | +create table marketmakers |
| 94 | +( |
| 95 | + timestamp bigint, |
| 96 | + stockId int, |
| 97 | + name text, |
| 98 | + isPrimary bool, |
| 99 | + mode text, |
| 100 | + state text |
| 101 | +); |
| 102 | + |
| 103 | +create table orders |
| 104 | +( |
| 105 | + stockId int not null, |
| 106 | + timestamp bigint not null, |
| 107 | + orderId bigint primary key not null, |
| 108 | + side text, |
| 109 | + quantity int not null, |
| 110 | + price numeric(10,4) not null, |
| 111 | + attribution text, |
| 112 | + prevOrder bigint |
| 113 | +); |
| 114 | + |
| 115 | +create table executions |
| 116 | +( |
| 117 | + timestamp bigint not null, |
| 118 | + orderId bigint, |
| 119 | + stockId int not null, |
| 120 | + quantity int not null, |
| 121 | + price numeric(10,4), |
| 122 | +); |
| 123 | + |
| 124 | +create table cancelations |
| 125 | +( |
| 126 | + timestamp bigint not null, |
| 127 | + orderId bigint not null, |
| 128 | + stockId int not null, |
| 129 | + quantity int |
| 130 | +); |
| 131 | + |
| 132 | +create table orderbook |
| 133 | +( |
| 134 | + orderId bigint, |
| 135 | + stockId int, |
| 136 | + side text, |
| 137 | + price numeric(10,4), |
| 138 | + quantity int, |
| 139 | + primary key(orderid, price) |
| 140 | +); |
| 141 | +commit; |
| 142 | + |
| 143 | +begin bulk write; |
| 144 | +create index on orderbook(orderId); |
| 145 | +create index on cancelations(timestamp); |
| 146 | +create index on executions(timestamp); |
| 147 | +create index on orders(timestamp); |
| 148 | +commit; |
| 149 | +``` |
| 150 | + |
| 151 | +The tables `stocks` and `marketmakers` are static. |
| 152 | +We will populate them once at startup and use them to show human-readable info (e.g., displaying a stock's name, not just its id). |
| 153 | + |
| 154 | + |
| 155 | +The tables `orders`, `executions`, and `cancelations` are append only and store a complete history of the exchange. |
| 156 | +They act as source of truth and also allow us to recompute the state of the exchange at any point in time. |
| 157 | + |
| 158 | +{{< callout type="info" >}} |
| 159 | +For a more in depth explanation of these tables, take a look at the [NASDAQ example dataset page](/docs/example_datasets/nasdaq). |
| 160 | +{{< /callout >}} |
| 161 | + |
| 162 | +Since we are usually interested in the *current* state of the exchange, it might be a good idea to optimize for that. |
| 163 | +To this end, we are adding another table called *orderbook* which keeps track of which orders are currently active. |
| 164 | + |
| 165 | +With this setup, the following four sets of DML statements are enough to keep track of the exchange state. |
| 166 | + |
| 167 | +### Adding a new order |
| 168 | + |
| 169 | +```sql |
| 170 | +begin; |
| 171 | +-- 'orderID' wants to BUY/SELL ('side') 'quantity' amount of 'stockId' at 'price'. |
| 172 | +-- Attribution optionally refers to a marketmaker. |
| 173 | +-- prevOrder is NULL as we insert a new order which doesn't replace a previous order. |
| 174 | +insert into orders values(stockId, timestamp, orderId, side, quantity, price, attribution, null); |
| 175 | +-- Add the new order to the order book as well. |
| 176 | +insert into orderbook values(orderId, stockId, side, price, quantity); |
| 177 | +commit; |
| 178 | +``` |
| 179 | + |
| 180 | +### Updating an existing order |
| 181 | + |
| 182 | +```sql |
| 183 | +begin; |
| 184 | +-- Append the new order and mark it as superseding the previous order |
| 185 | +insert into orders values((...), prevOrder); |
| 186 | +-- Remove the now obsolete old order from the order book |
| 187 | +delete from orderbook where orderId = prevOrder; |
| 188 | +-- Add the updated order to the orderbook |
| 189 | +insert into orderbook values((...)); |
| 190 | +commit; |
| 191 | +``` |
| 192 | + |
| 193 | +### Executing an order |
| 194 | + |
| 195 | +```sql |
| 196 | +begin; |
| 197 | +-- 'executedQuantity' of 'stockId' from 'executedOrderId' have exchanged hands. |
| 198 | +-- Almost all execution events don't have a price (last 'NULL') |
| 199 | +-- since the price is already specified in the executed order |
| 200 | +insert into executions values(timestamp, executedOrderId, stockId, executedQuantity, null); |
| 201 | +-- Reduce amount in order book accordingly. |
| 202 | +-- There still might be some shares left as orders can be fulfilled partially. |
| 203 | +update orderbook set quantity = quantity - executedQuantity where orderId = executedOrderId; |
| 204 | +commit; |
| 205 | +``` |
| 206 | + |
| 207 | +### Canceling an order |
| 208 | + |
| 209 | +```sql |
| 210 | +begin; |
| 211 | +insert into cancelations values(timestamp, canceledOrderId, stockId, canceledQuantity); |
| 212 | +-- If canceledQuantity == 0, i.e. full cancelation, we can remove the order from the book: |
| 213 | +delete from orderbook where orderid = canceledOrderId and canceledQuantity = 0; |
| 214 | +-- Else it's a partial cancelation, update order book accordingly: |
| 215 | +update orderbook set quantity = quantity - canceledQuantity |
| 216 | + where orderId = canceledOrderId and canceledQuantity != 0; |
| 217 | +commit; |
| 218 | +``` |
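The four transactions above can also be read as operations on a map from order id to open order. The following is a minimal in-memory sketch of that bookkeeping in Python, not the actual client code: the names `BookEntry` and `OrderBook` are hypothetical, and the history tables `orders`, `executions`, and `cancelations` are left out on purpose.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class BookEntry:
    stock_id: int
    side: str        # 'BUY' or 'SELL'
    price: Decimal
    quantity: int


class OrderBook:
    """Hypothetical in-memory stand-in for the `orderbook` table, keyed by orderId."""

    def __init__(self):
        self.entries = {}

    def add(self, order_id, stock_id, side, price, quantity):
        # New order: becomes active immediately.
        self.entries[order_id] = BookEntry(stock_id, side, price, quantity)

    def replace(self, prev_order_id, order_id, stock_id, side, price, quantity):
        # Updated order: drop the superseded order, then add its replacement.
        self.entries.pop(prev_order_id, None)
        self.add(order_id, stock_id, side, price, quantity)

    def execute(self, order_id, executed_quantity):
        # Orders can be filled partially, so only reduce the open quantity.
        # Like the SQL update, an unknown order id changes nothing.
        entry = self.entries.get(order_id)
        if entry is not None:
            entry.quantity -= executed_quantity

    def cancel(self, order_id, canceled_quantity):
        # As in the SQL above, a canceled quantity of 0 means a full cancelation.
        if canceled_quantity == 0:
            self.entries.pop(order_id, None)
        else:
            entry = self.entries.get(order_id)
            if entry is not None:
                entry.quantity -= canceled_quantity


book = OrderBook()
book.add(1, 42, "BUY", Decimal("320.00"), 100)
book.execute(1, 30)                                  # 70 shares remain open
book.cancel(1, 20)                                   # 50 shares remain open
book.replace(1, 2, 42, "BUY", Decimal("321.00"), 50)
book.cancel(2, 0)                                    # full cancelation
```

In the real setup, each of these methods corresponds to one transaction that also appends to the matching history table, so the order book can always be recomputed from scratch.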
| 219 | + |
| 220 | +## The Client |
| 221 | + |
| 222 | +We have written a [simple C++ client](https://github.com/cedardb/examples/tree/main/nasdaq/client) to feed CedarDB with the parsed exchange events. |
| 223 | +Its job is to parse incoming events and issue the corresponding SQL statements introduced above. |
| 224 | + |
| 225 | +The client uses Postgres's `libpq` to talk to CedarDB. It leverages [pipeline mode](https://www.postgresql.org/docs/current/libpq-pipeline-mode.html) for better throughput, but is otherwise very straightforward. |
| 226 | +The application logic fits into 100 lines of code; most of the other 350 lines are required for CSV parsing. |
| 227 | + |
| 228 | +While the client is running, it replays the live exchange data in 100 millisecond batches, treating the point in time the program was started as 9:30 AM, i.e. the exact instant the market opens. |
| 229 | +So, if the client runs for 30 minutes, the database state will represent the state of the NASDAQ exchange 30 minutes after market open, i.e., 10:00 AM. |
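The mapping from wall-clock time to exchange time can be sketched as follows. This is an illustration in Python rather than the C++ client's actual code: the function names are hypothetical, and it assumes event timestamps are nanoseconds since midnight, as in the candlestick query further down.

```python
MARKET_OPEN_NS = 34_200_000_000_000   # 9:30 AM in nanoseconds since midnight
BATCH_NS = 100 * 1_000_000            # one replay batch covers 100 ms


def replay_cutoff(elapsed_ns):
    """Exchange timestamp up to which events should have been replayed,
    given the nanoseconds elapsed since the client was started."""
    full_batches = elapsed_ns // BATCH_NS
    return MARKET_OPEN_NS + full_batches * BATCH_NS


def format_time(ns_since_midnight):
    """Render a nanoseconds-since-midnight timestamp as HH:MM:SS."""
    seconds = ns_since_midnight // 1_000_000_000
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


thirty_minutes_ns = 30 * 60 * 1_000_000_000
print(format_time(replay_cutoff(thirty_minutes_ns)))  # prints 10:00:00
```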
| 230 | + |
| 231 | + |
| 232 | +## Grafana |
| 233 | + |
| 234 | +We use the default Grafana docker container and connect it to CedarDB with a [Postgres data source](https://github.com/cedardb/examples/blob/main/nasdaq/grafana/datasources/automatic.yml). |
| 235 | + |
| 236 | +The Grafana visualizations are driven by parametrized raw SQL queries containing the application logic. |
| 237 | +They are directly executed against CedarDB with no caching layer in between. |
| 238 | + |
| 239 | +Let's look at two example queries! |
| 240 | + |
| 241 | +### Order book depth |
| 242 | + |
| 243 | +This visualization calculates the depth of the order book at all price points. |
| 244 | + |
| 245 | + |
| 246 | +It can be read as follows: |
| 247 | +There are zero open orders at about $323.50. |
| 248 | +Everything to the right of that are open sell orders. People would like to sell for more than the current price. |
| 249 | +There are, for example, around 60,000 outstanding orders that would like to sell at $330.00. |
| 250 | +Conversely, everything to the left are buy orders. About 30,000 orders would like to buy at $320.00. |
| 251 | +Seems like humans prefer nice round numbers! |
| 252 | + |
| 253 | +The chart is generated by the following SQL query: |
| 254 | +```sql |
| 255 | +with orderdepth as ( |
| 256 | + -- count the number of available stock at each price for the buy and for the sell side |
| 257 | + select price, side, sum(quantity) as quantity |
| 258 | + from orderbook o, stocks s |
| 259 | + where o.stockId = s.stockId |
| 260 | + and s.name = 'AAPL' -- stock ticker symbol we're interested in |
| 261 | + group by price, side |
| 262 | + having sum(quantity) > 0 |
| 263 | +), |
| 264 | +cumulative_sell as ( |
| 265 | + -- transform the sell side into a cumulative sum with a window query |
| 266 | + select price, side, sum(quantity) over (order by price asc) as sum |
| 267 | + from orderdepth |
| 268 | + where side = 'SELL' |
| 269 | +), |
| 270 | +cumulative_buy as ( |
| 271 | + -- transform the buy side into a cumulative sum with a window query |
| 272 | + select price, side, sum(quantity) over (order by price desc) as sum |
| 273 | + from orderdepth |
| 274 | + where side = 'BUY' |
| 275 | + |
| 276 | +) |
| 277 | +-- stitch together buy and sell side |
| 278 | +select * |
| 279 | +from (select * from cumulative_sell union all select * from cumulative_buy) |
| 280 | +order by price; |
| 281 | +``` |
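To make the two window functions easier to follow, here is the same computation as a small Python sketch over an in-memory list of open orders for a single stock. The function name `depth_chart` and the sample data are made up for illustration; the SQL above remains the query that drives the dashboard.

```python
from collections import defaultdict
from itertools import accumulate


def depth_chart(entries):
    """entries: iterable of (side, price, quantity) for one stock.
    Returns (price, side, cumulative_quantity) rows ordered by price."""
    # Equivalent of the `orderdepth` CTE: total open quantity per price and side.
    per_level = defaultdict(int)
    for side, price, quantity in entries:
        per_level[(side, price)] += quantity

    def cumulative(side, descending):
        levels = sorted(
            ((p, q) for (s, p), q in per_level.items() if s == side and q > 0),
            reverse=descending,
        )
        totals = accumulate(q for _, q in levels)
        return [(p, side, total) for (p, _), total in zip(levels, totals)]

    # Sell side accumulates with rising price, buy side with falling price.
    rows = cumulative("SELL", descending=False) + cumulative("BUY", descending=True)
    return sorted(rows)


open_orders = [
    ("BUY", 320, 100), ("BUY", 319, 50), ("BUY", 320, 25),
    ("SELL", 324, 40), ("SELL", 325, 60),
]
for row in depth_chart(open_orders):
    print(row)
# (319, 'BUY', 175)
# (320, 'BUY', 125)
# (324, 'SELL', 40)
# (325, 'SELL', 100)
```

Both sides grow as you move away from the gap between the best buy and best sell price, which is what gives the chart its characteristic valley shape.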
| 282 | + |
| 283 | +### Candlesticks |
| 284 | + |
| 285 | +The candlestick chart is probably the best-known chart type for showing a stock's price history. |
| 286 | +Wikipedia has a [nice introduction](https://en.wikipedia.org/wiki/Candlestick_chart). |
| 287 | + |
| 288 | + |
| 289 | + |
| 290 | +The chart is generated by the following SQL query: |
| 291 | +```sql |
| 292 | +with limits as ( |
| 293 | + select 34200000000000::bigint as start, |
| 294 | + -- from 9:30 AM (in nanoseconds since midnight), i.e. the start of trading day |
| 295 | + 34200000000000 + (30*60)::bigint * 1000 * 1000 * 1000 as end, |
| 296 | + -- to 30*60 seconds = 30 minutes after market start |
| 297 | + 10::bigint * 1000 * 1000 * 1000 as step |
| 298 | + -- bin size of 10 seconds |
| 299 | +), bins as ( |
| 300 | + -- we always generate the bins from the start of the trading day so they are stable |
| 301 | + select generate_series(l.start ,l.end, l.step) as time |
| 302 | + from limits l |
| 303 | +), |
| 304 | +prices as ( |
| 305 | + -- For any order of a given stock executed within the relevant time span, find the price |
| 306 | + select |
| 307 | + e.timestamp as time, s.name as metric, |
| 308 | + -- if the execution itself has a price, it takes precedence |
| 309 | + max(coalesce(e.price, o.price)) as value, |
| 310 | + max(coalesce(e.price, o.price) * e.quantity) as volume |
| 311 | + from executions e, stocks s, orders o, limits l |
| 312 | + where e.orderid = o.orderid |
| 313 | + and o.stockid = s.stockid |
| 314 | + and s.name = 'AAPL' -- the stock ticker symbol we're interested in |
| 315 | + and e.timestamp >= l.start |
| 316 | + and e.timestamp < l.end |
| 317 | + group by e.timestamp, s.name |
| 318 | +), |
| 319 | +binned as ( |
| 320 | + select |
| 321 | + extract(epoch from current_date + (b.time/(1000*1000) * interval '1 millisecond')) as time, |
| 322 | + p.metric, |
| 323 | + first_value(p.value) over w as open, |
| 324 | + last_value(p.value) over w as close, |
| 325 | + max(p.value) over w as high, |
| 326 | + min(p.value) over w as low, |
| 327 | + sum(p.volume) over w as volume, |
| 328 | + row_number() over w as rn |
| 329 | + from prices p, bins b, limits l |
| 330 | + -- assign each event into its bin |
| 331 | + where p.time >= b.time and p.time < b.time + l.step |
| 332 | + -- for each bin, find the candle stick parameters with a window function |
| 333 | + window w as ( |
| 334 | + partition by b.time, p.metric |
| 335 | + order by p.time asc |
| 336 | + rows between unbounded preceding and unbounded following) |
| 337 | +) |
| 338 | +select metric, time, open, close, high, low, volume |
| 339 | +from binned b |
| 340 | +where rn = 1; |
| 341 | +``` |
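The binning logic of this query can be sketched in a few lines of Python. This is a simplified illustration with a hypothetical `candlesticks` function and made-up prices: it assumes each execution already carries its resolved price, and it skips the per-timestamp pre-aggregation that the `prices` CTE performs.

```python
SECOND = 1_000_000_000  # timestamps are nanoseconds since midnight


def candlesticks(executions, start, end, step):
    """executions: iterable of (timestamp, price, quantity).
    Returns a dict mapping each bin's start time to its candle."""
    candles = {}
    for ts, price, quantity in sorted(executions):
        if not (start <= ts < end):
            continue
        # Bins are anchored at `start`, so they stay stable between refreshes.
        bin_start = start + (ts - start) // step * step
        candle = candles.get(bin_start)
        if candle is None:
            candles[bin_start] = {
                "open": price, "close": price,
                "high": price, "low": price,
                "volume": price * quantity,
            }
        else:
            candle["close"] = price
            candle["high"] = max(candle["high"], price)
            candle["low"] = min(candle["low"], price)
            candle["volume"] += price * quantity
    return candles


start = 34_200 * SECOND           # 9:30 AM
end = start + 30 * 60 * SECOND    # 30 minutes after market open
step = 10 * SECOND                # bin size of 10 seconds

sample = [
    (start + 1 * SECOND, 100, 2),
    (start + 5 * SECOND, 103, 1),
    (start + 9 * SECOND, 101, 1),
    (start + 12 * SECOND, 99, 4),
]
result = candlesticks(sample, start, end, step)
print(result[start])
# {'open': 100, 'close': 101, 'high': 103, 'low': 100, 'volume': 404}
```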
| 342 | + |
| 343 | +## Putting it all together |
| 344 | + |
| 345 | +We have built a Grafana dashboard with the above visualizations plus a few more. |
| 346 | +The dashboard auto-refreshes multiple times per second to give an up-to-date view of the exchange state. |
| 347 | + |
| 348 | +The following video shows a live demo: |
| 349 | +<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/szKVXWWXQJg?si=XbXN9RPJ1Lz47dZ-" title="YouTube video player" frameborder="0" allow="accelerometer; clipboard-write; encrypted-media; gyroscope; picture-in-picture" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe> |
| 350 | + |
| 351 | + |
| 352 | + |
| 353 | +If you want to try it out yourself, just clone our [examples repository](https://github.com/cedardb/examples/) and follow the README in the `nasdaq` subdirectory. |
| 354 | + |