POUND hack, take two, rebuilt with AWS services

Over the weekend I refactored my POUND project into services and put it on AWS. Repo is private for now, but I wanted to explain the architecture.

  1. The client, which handles a few jobs:
    • Fingerprint the browser
    • Prepare a pageview data object
    • Update the hash
    • Send the pageview

    It's 83kb in compiled form; I could probably get it under 50kb by dropping the cookie library. Most of the page weight is in the fingerprinting. This site is currently carrying the client (you should see a hash in the URL above).

  2. AWS API Gateway. I ended up building this endpoint around a GET request because GETs are faster, and I pass the pageview data through headers. Using GET should also mean that the browser silently retries failed requests. API Gateway provides CORS configuration and comes with SSL as standard.
  3. AWS Lambda function that takes the pageview event, does some light transformation/enrichment, and then puts a new item into DynamoDB.
  4. AWS DynamoDB table. Previously I had been using Firebase Storage, but this approach should be more scalable. I flirted briefly with Kinesis Firehose into S3 & Redshift, but that required too much configuration, and I like the flexibility of NoSQL for now.
  5. Create React App/Vis.js. This part is pretty much the same as before, just moved to S3 and refactored for the new data supply. New link.
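
To make steps 2 through 4 concrete, here is a minimal sketch of what the Lambda side could look like. It is not the code from the private repo. It assumes a Python runtime, API Gateway's Lambda proxy integration (which hands the function a `headers` map and `requestContext.identity.sourceIp`), and a single hypothetical `x-pageview` header carrying the pageview as JSON. The field names and the injected `put_item` callable are also made up for illustration.

```python
import json
import time
import uuid

# Hypothetical header name: the post only says the data travels in headers.
PAGEVIEW_HEADER = "x-pageview"


def build_item(event, now=None):
    """Turn an API Gateway proxy event into a plain dict ready for storage."""
    # Header casing can vary, so normalise the names to lower case first.
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    pageview = json.loads(headers.get(PAGEVIEW_HEADER, "{}"))
    identity = (event.get("requestContext") or {}).get("identity") or {}
    return {
        "id": str(uuid.uuid4()),
        # Light enrichment: a server-side timestamp plus request metadata.
        "received_at": int(now if now is not None else time.time()),
        "fingerprint": pageview.get("fingerprint"),
        "url": pageview.get("url"),
        "hash": pageview.get("hash"),
        "user_agent": headers.get("user-agent"),
        "source_ip": identity.get("sourceIp"),
    }


def handler(event, context, put_item=None):
    """Lambda entry point. `put_item` is injected so the sketch runs offline."""
    item = build_item(event)
    if put_item is not None:
        put_item(item)  # a real deployment would write to DynamoDB here
    return {
        "statusCode": 204,
        "headers": {"Access-Control-Allow-Origin": "*"},
        "body": "",
    }
```

In a real deployment, `put_item` could wrap something like boto3's `Table.put_item(Item=item)`. Keeping the write behind a callable means the transformation can be tested without touching AWS.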

The visualization needs the most work right now. I hit the limits of Vis.js and really need to move to d3. The time-series nature of the data is currently missing, and I need to figure out a good way of conveying the “multiple pages” concept (or maybe it’ll be better to isolate a single GUID at a time). I’m also collecting user agent and IP address now, so I can remove bots and geolocate visitors later.
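
As a rough idea of how the stored user agent could later be used to drop bots, here is a naive sketch. The marker list and the `user_agent` field name are assumptions for illustration only; substring matching will miss bots that spoof a browser user agent, so treat it as a first pass at best.

```python
# Hypothetical marker list: substrings that commonly appear in crawler user agents.
BOT_MARKERS = ("bot", "crawler", "spider", "headless")


def looks_like_bot(user_agent):
    """Return True when the user agent is missing or contains a known marker."""
    if not user_agent:
        return True  # no user agent at all is treated as suspicious
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def drop_bots(items):
    """Keep only the pageview items that do not look like bot traffic."""
    return [item for item in items if not looks_like_bot(item.get("user_agent"))]
```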

Posted Jun. 26 2017, 1:25 pm by davis