Why you should think twice before using Firebase Firestore

Before I get into why you should think twice about using Firebase Firestore in production, I would like to give you a brief overview of what I personally used Firestore for, and what I do like about it (even for production).

I have been working on the app Simply Plural for about a year and a half now, and at the time of writing Apparyllis, my other, non-game-development company, has about 17k daily users. Firebase Firestore was a good start for the app, back when I never even intended to release it to the public and was just making it for a friend. Firestore allowed me to prototype functionality that would otherwise have taken me considerably longer. The integration of Firestore with other Google Cloud products was easy and allowed me to quickly use all the different components of Firebase to create a well-functioning small-scale app.

Firestore-first design

A lot of Firestore’s functionality is designed Firestore-first, not customer-first. This becomes clear in the many limitations you hit when you try to build basic functionality with it.

Even ignoring all the costs associated with Firestore that are described below, it’s still not as amazing a product as it is made out to be.

Case-sensitivity and usernames

To my surprise, after releasing the app to the public, we found that users were able to name themselves both foo and Foo. This isn’t ideal, as you (or at least I) typically want usernames to be case-insensitive. We checked whether a username was taken by checking whether a document with that exact name already existed, but that check is case-sensitive, and Firestore offers no way to run a case-insensitive query.

As a workaround, we found that some other developers stored each username twice in the database: once in its intended casing and once in full lowercase. This allowed them to query whether the name already exists, which is a fair workaround for this issue. But as a developer, you want to be able to run case-insensitive queries in a database that is meant to be used in production. The workaround also doesn’t scale well if you have a lot of fields that need to be case-insensitive.
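
A minimal sketch of that workaround might look like the following. The `usernameLower` field and all function names are hypothetical, and the in-memory `Set` merely stands in for a database lookup on the lowercase field; no Firestore API is used here.

```javascript
// Hypothetical helper: derive the lookup key for a username.
function normalizeUsername(name) {
  return name.trim().toLowerCase();
}

// Build the document to store: original casing for display,
// lowercase copy (assumed field name `usernameLower`) for lookups.
function buildUserDocument(name) {
  return { username: name.trim(), usernameLower: normalizeUsername(name) };
}

// `takenKeys` stands in for a database query on `usernameLower`.
function isUsernameAvailable(name, takenKeys) {
  return !takenKeys.has(normalizeUsername(name));
}

const takenKeys = new Set([buildUserDocument("Foo").usernameLower]);
console.log(isUsernameAvailable("foo", takenKeys)); // false
console.log(isUsernameAvailable("Bar", takenKeys)); // true
```

Note that plain `toLowerCase()` is only an approximation of case folding for non-ASCII names, so a real implementation may need a stricter normalization step.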

Query limitations

We had a specific query that needed to return all documents whose stored time range overlaps a user-specified time range. This, however, proved to be difficult. According to Firestore’s own documentation, you cannot use mixed comparisons (>= and <=) on more than one field in a compound query (a query with more than one .where()).

The query for this in our new MongoDB backend is:

{
  $or: [
    { startTime: { $gte: Number(req.query.startTime) }, endTime: { $gte: Number(req.query.endTime) } }, // starts after start, ends after end
    { startTime: { $lte: Number(req.query.startTime) }, endTime: { $gte: Number(req.query.startTime) } }, // starts before start, ends after start
    { startTime: { $gte: Number(req.query.startTime) }, endTime: { $lte: Number(req.query.endTime) } }, // starts after start, ends before end
    { startTime: { $lte: Number(req.query.endTime) }, endTime: { $gte: Number(req.query.endTime) } } // starts before end, ends after end
  ]
}

This simply isn’t possible in Firestore.
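
For readers less familiar with the problem: the general interval-overlap test compares two different fields with opposite operators, which is the shape of query described above as unsupported. The sketch below shows the textbook condition as a plain predicate and as a MongoDB-style filter object. The names `overlaps` and `buildOverlapFilter` are hypothetical, and this is not necessarily the exact logic of the query above.

```javascript
// Textbook overlap test for closed ranges: two ranges overlap when
// each one starts no later than the other one ends.
function overlaps(doc, queryStart, queryEnd) {
  return doc.startTime <= queryEnd && doc.endTime >= queryStart;
}

// The same condition as a MongoDB-style filter object.
// Field names follow the query above; the function name is hypothetical.
function buildOverlapFilter(queryStart, queryEnd) {
  return {
    startTime: { $lte: Number(queryEnd) },
    endTime: { $gte: Number(queryStart) },
  };
}

const docs = [
  { id: "before", startTime: 0, endTime: 5 },
  { id: "straddles-start", startTime: 8, endTime: 12 },
  { id: "inside", startTime: 12, endTime: 15 },
  { id: "after", startTime: 25, endTime: 30 },
];
const matches = docs.filter((d) => overlaps(d, 10, 20)).map((d) => d.id);
console.log(matches); // [ 'straddles-start', 'inside' ]
```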

Exporting data from backups

Once in a while a user came along who had messed something up in their account, or had deleted their account by accident. They would contact us and ask whether we could undo their mistake. We thought: okay, we have backups, we can do that! However, Firestore does not allow partial restores. You can only restore your full database; you can’t restore just a specific collection or document. Nor can you open the backup in a text editor to try to recover data that way.

We felt sorry for one user in particular, and we found one hacky way to retrieve that user’s data. We downloaded our backup and launched a local Firestore database from it, which then showed us the entire backed-up database.

Needless to say, this is a horrible way of retrieving a user’s data. On top of that, once our backup grew larger than 2 GB, the local instance of Firestore refused to launch and we could no longer restore users’ data if they messed something up.

Document schema and public API

One thing we wanted to do for a long time was to create a publicly available API that our users could use to extend the app with their own third-party integrations. When we investigated doing this with Firestore, we decided there wasn’t any feasible approach. Sure, you could build an API using Cloud Functions, but you would again be paying for compute time on top of reads, writes and deletes.

Another issue we ran into is that, without some form of schema validation on the input we receive from end-users, it is too easy for them to break their own data. For example, changing your username from “Foo” to “Bar” can be done with a document write, but nothing stops a user from writing a number, an array or an object into the username field.

If a user breaks their own data, you could argue this is their problem, but in an ideal world an end-user cannot break their own account while trying to create a third-party integration.

Changing the field from a string to a number on your user document would almost certainly break the app for that specific user, and you would get misleading bug reports from your Crashlytics integration.
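
The kind of check that was missing can be sketched as a small validator that runs before a write is accepted. The function name, the field rules, and the length limit are illustrative assumptions, not the app’s actual schema.

```javascript
// Hypothetical validator for an incoming user update.
// Returns a list of problems; an empty list means the update is acceptable.
function validateUserUpdate(update) {
  if (typeof update !== "object" || update === null || Array.isArray(update)) {
    return ["update must be an object"];
  }
  const errors = [];
  if ("username" in update) {
    if (typeof update.username !== "string") {
      errors.push("username must be a string");
    } else if (update.username.length === 0 || update.username.length > 100) {
      // The 100-character limit is an arbitrary example value.
      errors.push("username must be 1-100 characters long");
    }
  }
  return errors;
}

console.log(validateUserUpdate({ username: "Bar" })); // []
console.log(validateUserUpdate({ username: 42 })); // [ 'username must be a string' ]
console.log(validateUserUpdate({ username: ["Bar"] })); // [ 'username must be a string' ]
```

An API server can reject the write whenever the returned list is non-empty, so a broken third-party integration gets an error message instead of corrupting the account.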

The cost of using Firestore

The cost of Firestore seems reasonable at face value and, in theory, it is. However, as other sources have reported, human error can easily drive the cost of your Firestore implementation into the thousands of dollars. And even without any human error, there are still caveats worth mentioning.

When we left Firestore, we were getting a consistent $200-$300 monthly bill for operating an app that had, at the time, around 10k daily users. That price seemed fair for the number of daily users; however, there were many ways we could have reduced it, had Firestore not been so… well, you’ll read it below.

Cost analytics limitations

When using Firestore, regardless of whether you are building an API or Cloud Functions, there are no analytics on which documents or common routes are being read. Firebase only shows you how many reads happened, not where they came from.

This caused a spike in cost during one of the months in which we used Firestore: something in the app was making considerably more document reads than it should have, and we were completely at a loss as to what could be causing it.

Firestore itself has no built-in tools to see which parts of your database are read most often. The Firestore SDK simply offers your typical read, write and delete operations for whatever document you want, but no way to log them.

The only way we were able to log this was to add, at every location in the app where we do a read, write or delete, an additional function call that reports the operation to Google Analytics.

Once the Google Analytics usage started rolling in, we had a better idea of where some of the cost was coming from, but we still had a considerable unaccounted-for amount of reads coming from somewhere.

After a lot of research into every location where we make read calls, we found that a Cloud Function was the cause. Cloud Functions, however, cannot log to Google Analytics, so we found this out purely by going through the code with a fine-tooth comb, looking for every possible way we could be making too many reads.

Once we had identified the Cloud Function as the source of the reads, we fixed it and the cost of the extra reads stopped coming in. However, there were still some unaccounted-for reads (more on that below).
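
That kind of manual instrumentation can be sketched as a wrapper around each read call. Everything here is hypothetical: `withReadLogging`, the `logEvent` callback (standing in for an analytics call), and the fake `readDocument` function. No Firestore or Google Analytics API is used.

```javascript
// Wrap a read function so every call is reported before it runs.
// `logEvent` stands in for whatever analytics call the app uses.
function withReadLogging(readFn, source, logEvent) {
  return function loggedRead(path) {
    logEvent({ operation: "read", path, source });
    return readFn(path);
  };
}

// Fake data store and read function, for illustration only.
const fakeDocuments = { "users/alice": { name: "Alice" } };
const readDocument = (path) => fakeDocuments[path] ?? null;

// Count reads per call site instead of sending them anywhere.
const readCounts = new Map();
const countEvent = (event) =>
  readCounts.set(event.source, (readCounts.get(event.source) ?? 0) + 1);

const readFromProfileScreen = withReadLogging(readDocument, "profileScreen", countEvent);
readFromProfileScreen("users/alice");
readFromProfileScreen("users/alice");
console.log(readCounts.get("profileScreen")); // 2
```

Tagging each wrapped call with a `source` label is what makes it possible to tell which screen or feature is responsible for a spike, which is the information the read counter in the console does not provide.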

The fear of abuse from end-users

Since Firestore charges you for every read, a user with malicious intent can easily cause a spike in reads on your database if you don’t route every read, write and delete through an API (and not needing an API is one of Firestore’s selling points).

The way you communicate with Firestore is by using the public API key from your app and querying the results you need through the SDK (access is protected by read/write rules). As one Reddit user rightly asks: how do you protect your database from malicious queries?

Most users recommend using Cloud Functions as some sort of API to guard yourself against most malicious use. This, however, requires you to:

Create some sort of API after all (which Firestore likes to market as redundant)

Pay for the usage of Cloud Functions on top of your reads, writes and deletes

Update your app to use these API calls

If access to certain documents is fully restricted to Cloud Functions only, you lose the ability to get Firestore live updates.

Even when you guard against malicious use through API calls, you are still charged per read when a malicious user spams your API endpoints with valid authentication.

Unexpected charges from Firestore

Firestore has made me go “Are you for real?” quite a few times as we navigated all its hidden costs. Some of those costs apply to such bare-minimum functionality that using Firestore for anything serious can be called into question.

Console usage

In the very first days of my development with Firestore, we were getting a lot more reads than we should have. We contacted Firestore about this. As a reply we got: “Navigating the document database in the Firestore Cloud Console will also incur read, write and delete costs”. This means that purely by browsing the database in the online cloud console, we were incurring reads. Compared to actual app usage this is quite low, but it’s still worth mentioning.

Backups

The next unexpected charge we were made aware of is that backing up your document database by exporting all documents on a daily basis incurs a read charge for every document that is backed up. If you have a chat app, you can only imagine the costs you’d incur every day just for doing the basic thing of backing up your database. And in keeping with Firestore’s habit of charging you for everything, should you ever want to restore from a backup, you will be charged one write per document, and writes are more expensive than reads.
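
To make that concrete, the read charges of a full export grow linearly with the number of documents. The sketch below uses a placeholder price per 100,000 reads and a made-up document count; both are assumptions for illustration only, not official pricing or our real numbers.

```javascript
// Estimate the read charges caused by exporting every document once.
// `pricePer100kReads` is a placeholder, not an official price.
function backupReadCost(documentCount, pricePer100kReads) {
  return (documentCount / 100000) * pricePer100kReads;
}

const documentCount = 50000000; // hypothetical: 50 million chat messages
const pricePer100kReads = 0.06; // hypothetical price in USD
const perBackup = backupReadCost(documentCount, pricePer100kReads);
const perMonth = perBackup * 30; // one backup per day

console.log(perBackup.toFixed(2)); // 30.00
console.log(perMonth.toFixed(2)); // 900.00
```

Under these assumed numbers the backups alone would cost more per month than the entire bill mentioned earlier, which is why the per-document charge matters for document-heavy apps.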

Security rules

Another hidden cost of Firestore is that the security rules guarding access to your documents also incur a read charge whenever a rule accesses the data of a document.

I am quite sure that there are other hidden costs we have forgotten about (we did the transfer about a year ago). However, these alone should be enough to make you think twice about Firestore.

Synopsis

To summarize, we went through a lot of headaches and unexpected costs by deciding to use Firestore for our project. In hindsight, we should have done more research on the limitations and costs of Firestore before we used it to launch our app.

Today, without Firestore, we pay less than we did with Firestore, and we have the flexibility of our own API, document schema validation, no hidden costs and all the analytics we need to maintain the services we provide.

Our current stack is a load balancer optimized for production usage in front of 2 API servers: a primary and a secondary, which share the load of the API calls. In addition, we run a MongoDB replica set with 3 servers (so elections can function if one goes down). We also run a separate server that serves the user-uploaded avatars (we host 2 million of them at the time of writing) and a mail server that sends our emails for user reports. Every server is currently routed through Cloudflare for DDoS protection.

In total we have around 130,000 registered users in the app, and our servers are able to handle the load quite easily, even during peak times and during unexpected increases in user counts.

Every server is currently hosted on DigitalOcean, and we have to say we’re really happy with their service, so if you’re looking to get a similar stack up and running, we strongly recommend them.

If you’ve made it this far, thank you for taking the time to read my not-so-short rant. I felt that the experiences we went through should be written down so that maybe, hopefully, at least one other developer can read this and save themselves the time and headache.

If you still decide that Firestore is right for you and your needs, there’s nothing wrong with that. What we needed Firestore to do for us simply didn’t match what it offers, and as a small-time company, costs that can’t be lowered because of service limitations are just not an option for us.
