GAA


If you're from Ireland, you'll immediately know what the GAA is. For anyone reading my blog from elsewhere, the GAA is the Gaelic Athletic Association, Ireland's amateur sports organisation and one of the largest amateur sports organisations in the world. If you're interested, you can visit their website for more information.

Summary


I was a technical lead on this project, tasked with building a new membership service for the GAA with a team of 3 developers. If you're interested in how the project progressed, I've written about the entire process below. First, here is a summary of what it entailed.

  1. Microservice architecture using Kubernetes, Docker and GitHub Actions for deployment. Our logging was all monitored via AWS cloud monitoring, and all applications were built with health endpoints to ensure Kubernetes could manage them appropriately via liveness and readiness probes.

  2. Web app built with React, which was eventually converted to TypeScript. This was deployed onto AWS S3 with a CloudFront CDN.

  3. Java services built using Spring Boot, protected by JWT-based authorisation using an OIDC server. We made use of libraries such as ModelMapper for DTO conversion, Lombok for reduced boilerplate and Hibernate for entity definitions.

  4. The APIs were all driven by hypermedia hrefs and we followed Level 3 of the Richardson Maturity Model throughout the process. All naming conventions for the properties were based on schema.org.

  5. We used an RDS Postgres instance, as we wanted to store some JSON Schema files as JSONB.

  6. We used Keycloak as our OAuth2 server, as it was easily configurable and integrated well with both Spring and React.

Microservice approach


Initially the GAA asked us to build a service for managing covid declarations. We were given a month to build out the entire system, but it would give us the opportunity to onboard hundreds of thousands of users immediately. We also knew we would need to build out additional services in the near future to support membership payments and more, so from the get-go we went with a microservice-based architecture.

For the microservice architecture we opted for Kubernetes on AWS, as we didn't have a DevOps team and Kubernetes is great at container orchestration. You get quite a few features for free or with little effort thanks to Helm charts, and the cost can easily be managed through resource limits.

As for the technology choice, the majority of our software team was already familiar with Java and Spring, so we opted for Spring Boot based microservices to ensure we could deliver the application on time. I would have loved to explore Kotlin for some of our microservices but the rest of the team weren't as interested. We also wanted to apply some API best practices, so we opted to follow schema.org naming conventions and use HATEOAS, as we wanted each API to lead naturally to the other APIs users may well want to consume.
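
To make the HATEOAS idea concrete, here is a minimal sketch of what a hypermedia-driven response body could look like, using only the JDK. It is not our actual code: the `PersonResource` class, the HAL-style `_links` section and the URL paths are all assumptions for illustration. The property names (`givenName`, `familyName`, `memberOf`) come from schema.org's Person type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical person resource; property names follow schema.org's Person type.
final class PersonResource {

    private PersonResource() {}

    // Builds a response body whose "_links" section tells the client where it can go next,
    // so consumers follow hrefs instead of hard-coding URLs.
    static Map<String, Object> of(String id, String givenName, String familyName, String organisationId) {
        Map<String, Object> links = new LinkedHashMap<>();
        links.put("self", link("/persons/" + id));                        // assumed path
        links.put("memberOf", link("/organisations/" + organisationId));  // assumed path
        links.put("teams", link("/persons/" + id + "/teams"));            // assumed path

        Map<String, Object> body = new LinkedHashMap<>();
        body.put("@type", "Person");
        body.put("givenName", givenName);
        body.put("familyName", familyName);
        body.put("_links", links);
        return body;
    }

    // A link is just an object with an href, in the style of HAL.
    private static Map<String, String> link(String href) {
        return Map.of("href", href);
    }
}
```

A client that has fetched a person can then discover the organisation and team endpoints from the response itself rather than from documentation.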

OAuth Service


I decided to run a cost analysis across a number of OAuth products, based on our estimated number of users being around 400,000. I recall the cost of SaaS products such as Okta and Auth0 being extremely high. Cognito was slightly more reasonable but still expensive, so we opted to run a spike for Keycloak.

Keycloak seemed like a great tool. It would provide a lot of functionality for free for the GAA's support team, such as the ability to impersonate users when investigating issues. It would also be very cheap, given that it's open source and we'd only have to pay for the Kubernetes resources we used to host it. On top of this, it had a Helm chart available that we could run in HA mode. This would cost a little more, but it was well worth it for the availability benefits.

Keycloak allows web engineers to build custom themes using FreeMarker templates (similar to JSP). A lot of the UI team were not familiar with FreeMarker, which was a bit of a downside. The themes were also quite difficult for them to test, although I eventually wrote a Docker Compose file that used an in-memory database, making it a little easier. Below you can see some examples of the login/register screens.

Milestone 1


As mentioned earlier, we had been tasked with delivering a project to allow teams to manage covid forms. In a nutshell this required a person service for profiles, an organisation service for all the GAA organisations, an OAuth service to manage access and a web application. We also opted to keep the teams/covid forms data within the person service, as it was quite a small feature and the covid forms would most likely be removed eventually.

After some intense crunch time we managed to deliver the application in its initial stages. This allowed users to register for a new account, associate themselves with a specific organisation and then join a team (for example Belfast GAA team - under 16's team). They could then create and manage covid forms, ensuring everyone would be protected at training sessions and games. You can see examples of these below.

Initial Issues


The launch went well enough but there were significant load issues, as we hadn't anticipated the level of traffic we received. We didn't want to enable autoscaling due to strict budgets around hosting costs, so we just upgraded our RDS instance and added more Kubernetes nodes for the services that were under the most load.

After the initial registration load had softened, we had time to investigate SQL issues through the AWS database Performance Insights tool. We noticed a number of composite indexes were missing, and adding those provided some quick wins. However, some queries were still not using their indexes because they contained subqueries. We restructured these queries and got a significant performance increase.

We also spent some time configuring the database connection pool settings, because we noticed the applications were sometimes restarting due to a lack of available database connections. We also noticed the volume of HTTP calls between services was quite substantial, so we added some reference tables that would be updated through Kafka events, ensuring the data stayed fresh.
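
The reference-table idea can be sketched roughly as follows. This is an in-memory stand-in written with the JDK only, not our real implementation: the `OrganisationReferenceTable` class, its method names and the `version` field on the event are all assumptions. In the real services the rows lived in the database and the update method would be driven by a Kafka listener.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical local copy of organisation names, kept inside a consuming service.
final class OrganisationReferenceTable {

    private static final class Row {
        final String name;
        final long version;

        Row(String name, long version) {
            this.name = name;
            this.version = version;
        }
    }

    private final Map<String, Row> rows = new ConcurrentHashMap<>();

    // Called for every "organisation updated" event (assumed to carry a version number).
    // An event with an older or equal version is ignored, so a late or replayed
    // message cannot overwrite newer data.
    void onOrganisationUpdated(String organisationId, String name, long version) {
        rows.merge(organisationId, new Row(name, version),
                (current, incoming) -> incoming.version > current.version ? incoming : current);
    }

    // Local lookup that replaces what used to be an HTTP call to the organisation service.
    Optional<String> nameOf(String organisationId) {
        return Optional.ofNullable(rows.get(organisationId)).map(row -> row.name);
    }
}
```

The trade-off is eventual consistency: a read may briefly return the old name until the update event arrives, which was acceptable for reference data like this.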

Milestone 2


Now that we had around 400,000 users and the services were stable, it was time to progress to phase 2. This milestone would involve building our membership systems, which included the ability to register for families. We would need to provide organisations with administrator accounts, allowing them to create a range of packages users could choose from.

We decided to use Stripe Connect to manage memberships, allowing admins to create their own payment gateways and JSON Schema definitions for membership packages. Admins could follow a simple flow to get their organisations active on the GAA platform and then register custom membership packages, which were stored as JSON Schema files to be validated against later. We allowed quite significant customisation on their part: the type of membership offering (subscription/single payment), the length of the subscription, the price and more. The entire flow can be seen below:

At this point the application had 2 additional microservices: a membership service and a payment service. The payment service was made up of a call for registering organisations with Stripe and a webhook callback that allowed Stripe to send all events back to our application. We saved all these events locally when they came through and processed them with a scheduled job. When the scheduled job ran, it would publish an event through Kafka that would be picked up by the membership service, which updated the membership accordingly. We had split the event into two parts: payment request received and charge successful/failed.
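
The "save first, process later" flow described above can be sketched like this. It is a simplified in-memory illustration, not the production code: `WebhookInbox`, `StoredEvent` and the method names are hypothetical, the map stands in for our database table, and the `publisher` callback stands in for the Kafka producer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical inbox for Stripe webhook events.
final class WebhookInbox {

    static final class StoredEvent {
        final String id;
        final String type;
        final String payload;
        boolean processed;

        StoredEvent(String id, String type, String payload) {
            this.id = id;
            this.type = type;
            this.payload = payload;
        }
    }

    // LinkedHashMap keeps arrival order; the key is the event ID sent by Stripe.
    private final Map<String, StoredEvent> events = new LinkedHashMap<>();

    // Webhook handler: store the event and return quickly.
    // Returns false when the same event ID was already stored (webhooks can be redelivered).
    synchronized boolean store(String id, String type, String payload) {
        if (events.containsKey(id)) {
            return false;
        }
        events.put(id, new StoredEvent(id, type, payload));
        return true;
    }

    // Scheduled job: publish every unprocessed event in arrival order.
    // An event is only marked processed after the publisher returns normally,
    // so a failed publish is retried on the next run.
    synchronized int processPending(Consumer<StoredEvent> publisher) {
        int published = 0;
        for (StoredEvent event : events.values()) {
            if (!event.processed) {
                publisher.accept(event);
                event.processed = true;
                published++;
            }
        }
        return published;
    }
}
```

Keeping the webhook handler this thin means Stripe gets a fast response, and all the slower work happens in the scheduled job where it can be retried.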

Users could now see their active memberships and admins could ensure that the players on each team had all paid their membership fee successfully. We delivered these features after a few months, and since then tens of millions have been processed in membership fees. At this point it felt like a successful delivery. The GAA were happy with the product and I was happy to move on to my next challenge.

Milestone 2 Issues


Milestone 2 had one significant issue: the event processing had a race condition that we didn't notice. Sometimes the application would process the charge event before the payment event, and the way we had handled the logic meant the user's membership would be set back to a PENDING state. We fixed this race condition by using the unique end-to-end event ID to ensure events were processed in the correct order. However, some of the data was still out of sorts even after we reran all the affected events. We had to write quite a large SQL script to update the affected accounts. Luckily we also had snapshotting on the database in case something went wrong, but thankfully it all worked out well.
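
For illustration, here is one way to make this kind of handler safe against out-of-order events. It is a sketch of a related technique rather than the exact fix we shipped: instead of enforcing order, it makes the late "payment request received" event harmless. The `MembershipPaymentTracker` class and its method names are hypothetical, and the `ACTIVE` and `FAILED` statuses are assumed (only `PENDING` is named above).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker for the status of each payment, keyed by its end-to-end ID.
final class MembershipPaymentTracker {

    enum Status { PENDING, ACTIVE, FAILED }

    // Both events of one payment share the same end-to-end ID.
    private final Map<String, Status> statusByPaymentId = new ConcurrentHashMap<>();

    // "Payment request received": sets PENDING only when nothing is recorded yet.
    // If the charge result was already handled, the existing status is kept.
    Status onPaymentRequestReceived(String endToEndId) {
        return statusByPaymentId.merge(endToEndId, Status.PENDING, (current, incoming) -> current);
    }

    // "Charge successful/failed": always records the final outcome.
    Status onChargeResult(String endToEndId, boolean succeeded) {
        Status result = succeeded ? Status.ACTIVE : Status.FAILED;
        statusByPaymentId.put(endToEndId, result);
        return result;
    }

    Status statusOf(String endToEndId) {
        return statusByPaymentId.get(endToEndId);
    }
}
```

With this shape, processing the charge event first and the payment event second leaves the membership in its final state instead of resetting it to PENDING.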