Why you should think twice before using Firebase Firestore

Before I get into why you should think twice about Firebase Firestore in production, I'd like to give a brief overview of what I personally used Firestore for, and what I do like about it (even for production).

I have been working on the app Simply Plural for about a year and a half now at Apparyllis, my non-game-development company, and at the time of writing the app has about 17k daily users. Firebase Firestore was a good start for the app, back when I never even intended to release it to the public and was just making it for a friend. Firestore allowed me to create a prototype of functionality that would otherwise have taken me considerably longer. The integration of Firestore with other Google Cloud products was easy and allowed me to quickly utilize all the different components of Firebase to create a well-functioning small-scale app.

Firestore-first design

A lot of the functionality in Firestore is Firestore-first, not customer-first. This becomes clear in the many limitations you run into when trying to build basic functionality with Firestore.

Ignoring all the costs associated with Firestore that are detailed below, it's still not as amazing a product as it's made out to be.

Case-sensitivity and usernames

To my surprise, after releasing the app to the public, we found that users were able to name themselves foo and Foo. This isn't ideal, as you (or at least I) typically want usernames to be case-insensitive. Our check for whether a username was taken was to see if a document with that exact name already existed, but this doesn't account for casing, and Firestore offers no case-insensitive queries.

As a workaround, we found that some other developers were storing each username twice: once with its intended casing and once fully lowercased. This allowed them to query whether the name already exists, which is a fair workaround. But as a developer you want to be able to run case-insensitive queries in a database that is meant for production, and duplicating fields is not a solution that truly scales if you have many case-insensitive fields.
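A minimal sketch of that workaround, assuming hypothetical field names (username, usernameLower); the query in the comment follows the Firestore web SDK's where() syntax:

```javascript
// Sketch of the double-field workaround; field names are illustrative.
function toUserDoc(username) {
  return {
    username,                              // display casing
    usernameLower: username.toLowerCase(), // query key
  };
}

// Checking availability with the Firestore web SDK would then look
// roughly like:
//   db.collection('users')
//     .where('usernameLower', '==', candidate.toLowerCase())
//     .get()
```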

Query limitations

We had a specific query that required us to get all documents where a user-specified time range overlaps a time range stored in the database. In essence, we were trying to query all documents whose time range overlapped another time range. This proved to be difficult: you cannot use range comparisons (>= and <=) on more than one field in a compound query (a query with more than one .where()), as stated in Firestore's own documentation.

The query for this, in our new backend is:

{
			$or: [
				{ startTime: { $gte: Number(req.query.startTime) }, endTime: { $gte: Number(req.query.endTime) } }, // starts after start, ends after end
				{ startTime: { $lte: Number(req.query.startTime) }, endTime: { $gte: Number(req.query.startTime) } }, //start before start, ends after start
				{ startTime: { $gte: Number(req.query.startTime) }, endTime: { $lte: Number(req.query.endTime) } }, // start after start, ends before end
				{ startTime: { $lte: Number(req.query.endTime) }, endTime: { $gte: Number(req.query.endTime) } } //Starts before end, ends after end
			]
		}

This simply isn’t possible in Firestore.
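For comparison, the overlap test itself comes down to just two conditions. A small sketch in plain JavaScript (names are illustrative):

```javascript
// Two closed ranges [aStart, aEnd] and [bStart, bEnd] overlap exactly
// when each one starts before the other ends. The four-clause $or
// above reduces to this single pair of conditions.
function rangesOverlap(aStart, aEnd, bStart, bEnd) {
  return aStart <= bEnd && aEnd >= bStart;
}
```

Note that even this minimal form still needs range filters on two different fields (startTime and endTime), which is precisely what Firestore's compound-query rules forbid.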

Exporting data from backups

Once in a while a user came along who messed something up in their account, or deleted their account by accident. They would contact us and ask if we could undo the mistake. We thought: okay, we have backups, we can do that! However, Firestore does not allow partial restores. You can only restore the full database; you can't restore just a specific collection or document. Nor can you open the backup in a text editor to try and recover data that way.

We felt sorry for one user in particular, and we found one hacky way to retrieve that user's data. We downloaded our backup and launched a local Firestore instance from it, which was then able to show us the entire backed-up database.

Needless to say, this is a horrible way of retrieving a user's data. On top of that, once our backup grew larger than 2 GB, the local instance of Firestore refused to launch, and we could no longer restore users' data when they messed something up.

Document schema and public API

One thing we had wanted to do for a long time was to create a publicly available API, so our users could extend the app with their own third-party integrations. When we investigated doing this with Firestore, we concluded there was no feasible approach. Sure, you could build an API using Cloud Functions, but then you would be paying for compute time on top of read, write, and delete access.

Another issue we ran into is that, without some form of schema validation on the data you receive from end-users, it is too easy to break your own data. For example, if you want to change your username from “Foo” to “Bar”, you can do so with a document write, but nothing stops users from writing a number, an array, or an object into the username field.

If a user self-breaks the data, you could argue this is their issue, but in an ideal world an end-user cannot break their own account in an attempt to create a third party integration.

Changing a field on your user document from a string to a number would almost certainly break the app for that specific user, and you would be getting misleading bug reports from your Crashlytics integration.
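The kind of server-side guard Firestore leaves you to build yourself can be sketched in a couple of lines; the shape below is illustrative, not our production schema:

```javascript
// Reject updates whose username field is not a non-empty string.
function validateUserUpdate(data) {
  return typeof data.username === 'string' && data.username.length > 0;
}
```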

The cost of using Firestore

The cost of Firestore, at face value, seems reasonable, and in theory it is. However, as other sources have documented, it is easy for human error to scale the cost of your Firestore implementation into the thousands of dollars. But even without human error, there are caveats worth mentioning.

When we left Firestore, we were getting a consistent $200–$300 monthly bill for operating an app that had, at the time, around 10k daily users. That price seemed fair for the number of daily users, but there were many ways we could have reduced it, had Firestore not been so… well, you'll read about it below.

Cost analytics limitations

When using Firestore, whether through an API or Cloud Functions, there are no analytics on which documents or routes are being hit. Firebase only tells you how many reads happened, not where they came from.

During one of the months we used Firestore, this caused a spike in cost: something in the app was making considerably more document reads than it should, and we were completely at a loss as to the cause.

Firestore itself has no built-in tools to see which parts of your database are read most often; the Firestore SDK simply offers the typical read, write, and delete operations for whatever document you want, with no way to log them.

The only way we were able to log this was, for every location in the app where we do a read, write, or delete, to make an additional function call that logs the operation to Google Analytics.
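In sketch form, every data access in the app ended up wrapped like this (logEvent stands in for the actual Google Analytics call; all names here are illustrative):

```javascript
// Log the operation to analytics, then perform the real Firestore call.
function loggedRead(db, analytics, collection, id) {
  analytics.logEvent('firestore_read', { collection });
  return db.collection(collection).doc(id).get();
}
```

Multiply this by every read, write, and delete site in the app and you get an idea of the boilerplate involved.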

Once the Google Analytics data started rolling in, we had a better idea of where some of the cost was coming from, but a considerable number of reads remained unaccounted for.

After a lot of research into every location where we make read calls, we found that a Cloud Function was the cause. Cloud Functions, however, cannot log to Google Analytics, so we found this out purely by going over the code with a fine-tooth comb, checking every possible way we could be generating too many reads.

Once we identified the Cloud Function as the source of the reads, we fixed it and the cost of the extra reads stopped coming in. However, some reads were still unaccounted for (more on that below).

The fear of abuse from end-users

Knowing that Firestore charges you for every read, a malicious user can easily cause a spike in reads performed on your database when you don't route every read, write, and delete through an API (and not needing an API is one of Firestore's selling points).

The way you communicate with Firestore is by using your app's public API key and querying the results you need through the SDK (access is protected by read/write rules). As this user on Reddit rightly asks: how do you protect your database from malicious queries?

Most users recommend Cloud Functions as a sort of API to guard against most abuse. This, however, requires you to:

  • Create some sort of API after all (which Firestore likes to market as redundant)
  • Pay for the usage of Cloud Functions on top of your reads, writes, and deletes
  • Update your app to use these API calls
  • Give up Firestore live updates for any documents whose access is fully restricted to Cloud Functions

And even when you guard against abuse with API calls, you are still charged per read when a malicious user spams your endpoints with valid authentication.

Unexpected charges from Firestore

Firestore has made me go “Are you for real?” quite a few times as we navigated all its hidden costs. Some hidden costs apply to such bare-minimum functionality that using Firestore for anything serious can be called into question.

Console usage

In the very first days of my development with Firestore, we were getting far more reads than we should have. We contacted Firebase about this, and the reply was: “Navigating the document database in the Firestore Cloud Console will also incur read, write and delete costs”. In other words, purely browsing the database in the online cloud console was generating billable reads. Compared to actual app usage this is quite low, but it's still worth mentioning.

Backups

The next unexpected charge we became aware of: creating a daily backup of your document database by exporting all documents incurs a read charge per document backed up. If you have a chat app, you can imagine the costs you'd incur every day just for the basic act of backing up your database. And in keeping with Firestore's habit of charging for everything, should you ever want to restore from a backup, you will be charged one write per document, and writes are more expensive than reads.
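To get a feel for the numbers, here is a back-of-envelope calculation. The $0.06 per 100,000 document reads is the commonly cited Firestore rate, but treat it as an assumption and check current pricing:

```javascript
// Monthly cost of the daily export's reads alone, at an assumed
// rate of $0.06 per 100,000 document reads.
function monthlyBackupReadCost(docCount, daysPerMonth = 30) {
  return (docCount / 100000) * 0.06 * daysPerMonth;
}
```

A million documents backed up daily already adds roughly $18 a month, before a single user touches the app.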

Security rules

Another hidden cost of Firestore: the security rules guarding access to your documents also incur a read charge whenever a rule accesses the data of a document.

I am quite sure there are other hidden costs we have forgotten about (we migrated about a year ago). However, these alone should be enough to make you think twice about Firestore.

Synopsis

To summarize, we went through a lot of headaches and unexpected costs by deciding to use Firestore for our project. In hindsight, we should have done more research on Firestore's limitations and costs before launching our app with it.

Today, without Firestore, we pay less than we did before, and we have the flexibility of our own API, document schema validation, no hidden costs, and all the analytics we need to maintain the services we provide.

Our current stack is a production-grade load balancer in front of two API servers, a primary and a secondary, to spread the load of the API calls. In addition, we run a MongoDB replica set across 3 servers (so elections can still function if one goes down). We also run a separate server that serves user-uploaded avatars (we host 2 million of them at the time of writing) and a mail server that sends our emails for user reports. Every server is routed through Cloudflare for DDoS protection.

In total we have around 130,000 registered users, and our servers handle the load quite easily, even during peak times and unexpected spikes in user counts.

Every server is currently hosted with DigitalOcean, and we have to say we're really happy with their service; if you're looking to get a similar stack up and running, we strongly recommend them.

If you’ve made it this far, thank you for taking the time to read my not-so-short rant. I felt that the experiences we went through should be stated so that maybe, hopefully, at least one other developer can read this and save the time and headache.

If you still decide that Firestore is right for you and your needs, there's nothing wrong with that. Firestore simply didn't match what we needed it to do, and as a small company, costs that can't be lowered due to service limitations are just not an option for us.

Unreal Engine, and the hidden pitfalls of Blueprints

Hello!

Blueprints are useful, quick, and easy to use once you get used to them.

Event graphs are incredibly helpful for creating certain gameplay, and cross-function references are particularly useful in certain scenarios.

However useful and quick Blueprints are, there are quite a few hidden pitfalls that can unintentionally cause massive, avoidable performance drops. Some Blueprint pitfalls can even cause bugs that aren't straightforward to find unless you know some of the inner workings of Blueprints.

Explaining all the pitfalls of Blueprints in one blog post would be an insurmountable feat, so I have decided to make this a series that looks at specific Blueprint pitfalls and documents each one in detail.


Pure Performance Pitfalls with Loops

When it comes to performance, Blueprints are generally slower than C++, but Blueprints can still be useful, and complex functions can still be created if you know what to look out for.

First and foremost, one of the major causes of performance loss in Blueprints: pure functions (those without exec pins, such as “GetRandomPointInNavigableRadius”).

They're handy, they're neat, and they don't require you to hook up an execution pin! But, alas, these pure functions come with a lot of pitfalls when used incorrectly. Take the following example:

A function that checks if the result of GetRandomPointInNavigableRadius is valid and then prints the result. This prints a random point around -100,-100,-100

This looks harmless, but unfortunately it will call “GetRandomPointInNavigableRadius” twice. That is because pure functions run once per connection to a node: once to check the Return Value, and once for the Print String.

It seems quite counterintuitive, but it will make more sense if you take a look at the following example:

A function that checks if the result of GetRandomPointInNavigableRadius is valid, updates Origin, and then prints the result of the pure function. This prints a random point around 200, 200, 200

Initially, we set Origin to -100,-100,-100, we get a random point in radius, and we check if this was successful. It was? Great, let's print it. But before we do so, let's change the Origin to something else. Now, when we Print String, it runs the pure function again.

In the first example, Blueprints don't know that you didn't change the Origin between the Branch and the Print String; nor, in the second example, do they know that you changed the Origin to something else before running the Print String.

Because Blueprints don't know and don't check, pure functions always run once per connection, regardless of whether any change was made that would affect the return values.

Let's put this theory to the test and prove it.

To prove this theory, do the following: create a pure function that just returns the number 20, but add a Print String to the function, so that every time it runs, it prints your string.

A Const Pure function that prints “I was run!” and returns the value 20

Now, create a function that has a simple ForLoop and plug the PureProof function into “LastIndex”. Then print the index you have.

An event that runs a ForLoop from 0 to the return value of PureProof.

If you run this function, you will get 43 prints in the log:

  • 21 times the print string inside PureProof
  • 21 times the print string inside the loop body
  • 1 final run of PureProof after the last iteration, to decide to stop because the last index was reached

LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 0
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 1
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 2
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 3
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 4
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 5
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 6
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 7
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 8
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 9
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 10
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 11
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 12
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 13
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 14
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 15
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 16
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 17
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 18
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 19
LogBlueprintUserMessages: [Untitled_C_4] I was run!
LogBlueprintUserMessages: [Untitled_C_4] 20
LogBlueprintUserMessages: [Untitled_C_4] I was run!

Now, this is quite concerning. But for the purposes of this example it is not a worrisome scenario; the pure function just returns a value, it doesn't do anything expensive.

Now, imagine you do the following in PureProof.

A pure const function that runs a loop of 0-20 range and adds +1 on Final Value before returning the ReturnValue.

This function would still be called 21 + 1 times; however, the loop inside PureProof now also runs on each of those calls, which can be a major performance issue if you are doing any sort of expensive check in there, or using it to return a filtered result of an array.

The reason for this is that ForLoop is not a function but a macro, a macro that checks against the last index every iteration, and on each iteration it runs PureProof again to get that last index value.

The inside of the macro ForLoop
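The macro's behavior is easiest to see outside of Blueprints. As a rough JavaScript analogy (not actual Blueprint code), the pure call is effectively inlined into the loop condition, so it re-runs on every check:

```javascript
// Counts how often the "pure" function is evaluated.
let calls = 0;
function pureProof() {
  calls++;
  return 20;
}

// What the graph effectively does: re-evaluate the pure input on
// every loop-condition check.
for (let i = 0; i <= pureProof(); i++) {
  // Print String of the index would go here.
}
// 21 passing checks + 1 failing check = 22 evaluations in total.
```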

Now let’s take a look at a much more costly pure function.

A loop that gets all characters of a string and prints each character

Now, this initially looks quite harmless: it goes over every character of the string and prints each one. However, this is actually cause for concern.

As we now know, pure functions run once per connection, and in the case of a ForEachLoop that means twice per iteration.

Inside the ForEachLoop there are two main nodes that cause this behavior. “Length” and “Get”.

The inside of the For Each Loop macro

Each iteration, the macro checks whether the loop counter is smaller than the Length of the array, so every iteration it has to fetch the length of the array. If the check passes, we do a Get, which takes the array and fetches the element at that index.

Take a look at the following example:

A pure const function that prints “I have run!” and returns an array of 0-5 range each with their index as value.
An event that calls ForEachLoop with PureArrayProof as the input, printing the element each iteration.

Running this code ends up executing the PureArrayProof function 13 times:
12 times across the six iterations (2 per iteration), plus one additional time to check whether we have reached the length of the array (we have).

LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] 0
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] 1
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] 2
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] 3
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] 4
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] I have run!
LogBlueprintUserMessages: [Untitled_C_8] 5
LogBlueprintUserMessages: [Untitled_C_8] I have run!

I hope that at this point you can see how loops and pure functions can have a devastating effect on your performance. Let's take this to the extreme and re-create a scenario I once saw in a project, which caused quite a heavy performance impact.

A pure const function that runs over x amount of elements, does an arbitrary expensive check and when it passes it adds the property to an array, once complete it returns the array.
A pure const function that runs over the filtered array and does another expensive check per array element, and adds it to a local array, when done returning the array.

Then running the following:

An event that calls ForEachLoop with PureArrayProof as the input, printing the element each iteration.

This causes the function “GetFilteredArray” to run an incredibly high number of times. GetFilteredArray loops over the range 0-40; let's say it loops 40 times for the sake of simplicity when calculating how often everything runs.

Let's say it filters 50/50, so we end up with 20 filtered results inside “PureArrayProof”. That again filters 50/50, leaving us with 10 results when we run “PureArrayProof” in the event graph.

Knowing that a ForEachLoop evaluates its pure input twice per iteration plus one final time, we know that “PureArrayProof” will run 21 times before the loop is done.

Knowing that PureArrayProof runs 21 times, we know that “GetFilteredArray” will run 41 times per run of PureArrayProof (its own ForEachLoop over 20 elements: twice per iteration plus one final check).

This ends up running GetFilteredArray 861 times (21 × 41).

Knowing that GetFilteredArray runs 861 times, and that inside it a ForLoop over the range 0-40 calls ExpensiveCheckFunction, ExpensiveCheckFunction executes 41 times per run of GetFilteredArray.

This results in ExpensiveCheckFunction running a whopping 35,301 times (41 × 861), all so that the “simple” Print String in the event graph can run 10 times.
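The whole cascade can be modeled in a few lines of plain JavaScript, with call counters standing in for the print strings. This is an analogy, not Blueprint code, and the 50/50 filters are simulated:

```javascript
// forEachPure mimics the Blueprint ForEachLoop: it re-evaluates its
// pure array input for the Length check and again for the Get, plus
// one final Length check when the loop ends.
function forEachPure(produce, body) {
  for (let i = 0; i < produce().length; i++) {
    body(produce()[i]);
  }
}

let expensiveCalls = 0;
function expensiveCheck(v) {
  expensiveCalls++;
  return v < 20; // stand-in for the "expensive" 50/50 filter
}

let filteredCalls = 0;
function getFilteredArray() {
  filteredCalls++;
  const out = [];
  for (let i = 0; i <= 40; i++) {       // the inner ForLoop: 41 iterations
    if (expensiveCheck(i)) out.push(i); // keeps 20 elements
  }
  return out;
}

let proofCalls = 0;
function pureArrayProof() {
  proofCalls++;
  const out = [];
  // PureArrayProof's own ForEachLoop over the pure GetFilteredArray.
  forEachPure(getFilteredArray, (v) => {
    if (v % 2 === 0) out.push(v); // second 50/50 filter: 10 elements
  });
  return out;
}

// The event graph: a ForEachLoop over the pure PureArrayProof.
forEachPure(pureArrayProof, () => { /* Print String */ });
```

The counters come out to exactly the 21, 861, and 35,301 calls derived above.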


Now how do we solve this?

Simply cache the results of “GetFilteredArray” and “PureArrayProof” before running the ForEachLoop in the event graph. It seems like such a simple fix for such a complex problem, but this really is all you have to do.

Alternatively, don’t make the function pure and simply let the node have exec pins (I advise doing this for anything that is remotely expensive to avoid accidentally running into this issue).
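As a rough JavaScript analogy (not actual Blueprint code), caching amounts to evaluating the pure function once and looping over the stored result:

```javascript
// Counts how often the "pure" function is evaluated.
let calls = 0;
function pureProof() {
  calls++;
  return 20;
}

// Cache the pure result once, before the loop.
const lastIndex = pureProof();
for (let i = 0; i <= lastIndex; i++) {
  // Loop body.
}
// pureProof was evaluated exactly once.
```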

Execute Function Behavior Tree Task

TL;DR: I created a behavior tree task that lets you execute a function of your choosing on the target blackboard key object. Code snippets are at the bottom; the relevant files are here on GitHub.

Hey everyone! Yesterday I was working on some AI in Unreal Engine and found myself repeatedly making a task just to do a simple function call on the AI-controlled pawn, which is rather pointless if all the task does is execute a function and finish. So I decided to make a single task that does this for me.

Final result

General behavior of the node:
You create the node, select the target class you want to target, and a dropdown lets you select from all available UFunctions of this class. I personally filter out UFunctions that have a return value or parameters, to more easily find the function I want to execute.

Technical behavior of the node:
I created a new struct, FFunctionContext, that holds a TSubclassOf<UObject> and an FString; the TSubclassOf<UObject> is used to find the class we want to target, and the FString stores the function to execute.

In an editor plugin, I created an IPropertyTypeCustomization for this struct called FFunctionContextCustomization. For the TSubclassOf<UObject> I just generate the default value widget, as I don't need to modify it. For the string, however, I create a dropdown whose OptionsSource is the list of available, filtered UFunctions. We can build that list by reading the property's value as a formatted string and finding the UClass. From the UClass we can iterate over all of its UFunctions and get their names, flags, and other information that determines whether a UFunction is relevant.

Full code can be found at: https://github.com/CelPlays/SaltyAI

Customizing the Property

To get a custom dropdown for the property, you need to create an IPropertyTypeCustomization class for your struct and override CustomizeHeader. Once you've done that, you have full control over the Slate visuals and behavior of your struct property.

For the TSubClassOf property, I create the default property widget in the ValueContent of the HeaderRow and for the string property, I create a dropdown which gets filled when you select a class.

I also bind to SetOnPropertyValueChanged so that when the class changes, the options array is rebuilt and you can pick from the new class's functions.

void FFunctionContextCustomization::CustomizeHeader(TSharedRef<class IPropertyHandle> StructPropertyHandle, class FDetailWidgetRow& HeaderRow, IPropertyTypeCustomizationUtils& StructCustomizationUtils)
{
	ClassProperty = StructPropertyHandle->GetChildHandle("ContextClass");
	FunctionProperty = StructPropertyHandle->GetChildHandle("FunctionToExecute");

	FSimpleDelegate OnChange;
	OnChange.BindRaw(this, &FFunctionContextCustomization::OnClassChange);
	ClassProperty->SetOnPropertyValueChanged(OnChange);

	OnClassChange();

	HeaderRow.NameContent()
	[
		StructPropertyHandle->CreatePropertyNameWidget(FText::FromString("Function Context"))
	]
	.ValueContent()
	.MinDesiredWidth(500)
	[
		SNew(SVerticalBox)
		+ SVerticalBox::Slot()
		[
			ClassProperty->CreatePropertyValueWidget()
		]
		+ SVerticalBox::Slot()
		[
			SAssignNew(ComboBox, SComboBox<FComboItemType>)
			.OptionsSource(&Functions)
			.OnSelectionChanged(this, &FFunctionContextCustomization::OnSelectionChanged)
			.OnGenerateWidget(this, &FFunctionContextCustomization::MakeWidgetForOption)
			[
				SNew(STextBlock)
				.Text(this, &FFunctionContextCustomization::GetActiveFunction)
			]
		]
	];

	UpdateActiveSelectedClass();
}

Iterating UFunctions

Here I go over all UFunctions of the ContextClass. I only take functions that are BlueprintCallable, have no return value, have no native implementation, and take no parameters. Obviously these rules are per-project, and you can make them whatever you want.

for (TFieldIterator<UFunction> FuncIt(ContextClass); FuncIt; ++FuncIt)
{
	UFunction* Function = *FuncIt;

	//Only blueprint callable
	if (!Function->HasAnyFunctionFlags(FUNC_BlueprintCallable))
	{
		continue;
	}

	//Ignore return val function
	if (Function->GetReturnProperty())
	{
		continue;
	}

	//Ignore native functions
	if (Function->HasAnyFunctionFlags(FUNC_Native))
	{
		continue;
	}

	//If function has params ignore the function
	if (Function->NumParms > 0)
	{
		continue;
	}

	Functions.Add(MakeShareable(new FString(Function->GetName())));
}

Executing the UFunction

You can execute the UFunction by first finding it and then processing the event. I also added a boolean, bReturnSuccessIfInvalid, which makes the task return success even when the object or function can't be found.

UBlackboardComponent& BlackboardComp = *OwnerComp.GetBlackboardComponent();
UObject* Object = BlackboardComp.GetValueAsObject(Target.SelectedKeyName);

if (!Object)
{
	return bReturnSuccessIfInvalid ? EBTNodeResult::Succeeded : EBTNodeResult::Failed;
}

UFunction* Func = Object->FindFunction(FName(*FunctionContext.FunctionToExecute));

if (Func)
{
	Object->ProcessEvent(Func, nullptr);
}
else
{
	return bReturnSuccessIfInvalid ? EBTNodeResult::Succeeded : EBTNodeResult::Failed;
}

return EBTNodeResult::Succeeded;

Prerequisites

To get this to work you also have to register your IPropertyTypeCustomization in StartupModule() of your editor module. You can do this with the following code:

FPropertyEditorModule& PropertyModule = FModuleManager::LoadModuleChecked<FPropertyEditorModule>("PropertyEditor");

//Custom properties
PropertyModule.RegisterCustomPropertyTypeLayout("FunctionContext", FOnGetPropertyTypeCustomizationInstance::CreateStatic(&FFunctionContextCustomization::MakeInstance));

Also, make sure you add PropertyEditor as a dependency of your editor module. For your runtime module you want to add AIModule and GameplayTasks as dependencies; without GameplayTasks it won't compile.

Classifier – Quickly look up header/module info on classes

EDIT: I turned the server offline that was running this.

TL;DR: Go here to find module/header info: http://classifier.celdevs.com

Because I was tired of scrolling to the bottom of the Unreal Engine API pages to copy the module or header path, I created a tool that does it for me.

All you need to do is type in a class name (AActor, UWorld, FCoreDelegates, TArray), click “Gimme dat!”, and it'll give you the proper module/header.

If you want to see a video of it working, you can find one here: https://twitter.com/CatherineCel/status/1136344909244436480

To actually use the tool go here: http://classifier.celdevs.com

Procedural Island Generation UE4

Hi! Today I’m going into details of the procedural island generation I posted on my Twitter 🙂

To start off, here are a few great resources I've personally read that will help you understand the terms and tools we're using:

  • An awesome blog post on Perlin Noise.
  • A complete resource on Hexagonal Grids.
  • A resource on Biomes/Elevation.

I strongly suggest reading the above resources to get a good background knowledge of what I’ve used myself.

For the Perlin Noise I use the UnrealFastNoise from Chris Ashworth.

While this is a hexagonal grid, all of the code is easily adapted to square grids, so if you want a natural-looking map that isn't tied to hexagons, you can still follow this guide.

High-Level process

The order in which I generate the island is as follows:

  • Generate entire flat hexagonal grid
  • Mark edge tiles as ocean
  • Divide Island and Ocean (Create island shape)
  • Smoothen the island
  • Flood-fill to differentiate lake and ocean
  • Calculate tile distance from shore
  • Create elevation (disabled in my preview)
  • Assign biomes
  • Smoothen the biomes
  • Generate Flora (trees)
  • Generate Resources

The order of generation isn't hugely important; however, some steps are prerequisites for others, such as assigning biomes before generating flora, because flora depends on biomes. But you could easily generate resources first and then the flora.

Noise

Throughout the post I use Noise->GetNoise2D(x, y) a lot; this is how I create the Noise object:

UUFNNoiseGenerator* Noise = UUFNBlueprintFunctionLibrary::CreateNoiseGenerator(this, ENoiseType::Simplex, ECellularDistanceFunction::Euclidean, ECellularReturnType::CellValue, EFractalType::FBM, EInterp::InterpQuintic, SeedValue, 5, FrequencyValue, 2.f, .5f);

Generating the grid

The generation of the grid is entirely done with the Procedural Mesh Component from Unreal Engine.

To start the generation we iterate over the grid x size and grid y size and calculate the position of each tile, then add 7 vertices (one for the center and one per corner), and after that add 6 triangles. We store this information (the indexes of the vertices and triangles) in a struct per tile so we can later use it to assign biomes and elevate the terrain.

The actual calculation of the location of each vertex is neatly explained in this blog post.

for (int x = 0; x < xGridSize; x++)
{
    for (int y = 0; y < yGridSize; y++)
    {
        //Calculate the flat array index of this tile (y varies fastest)
        int32 UseIndex = x * yGridSize + y;

        //Create TSharedPtr of the tile
        Tiles[UseIndex] = MakeShareable(new FLevelTile(UseIndex, x, y));

        //Actually create the tile
        AddTileToMesh(x, y);
    }
}
void ALevelGenerator::AddTileToMesh(const int32& x, const int32& y)
{
	//Get center of the tile
	FVector Center = FVector::ZeroVector;
	GetCenterForTile(x, y, Center);

	//Always add center first
	AllUVs.Add(FVector2D(0.25, 0));
	int32 CenterIndex = AllVertices.Add(Center);

	//Get locations for each corner
	FVector a = GetCornerForTile(Center, NORTH);
	FVector b = GetCornerForTile(Center, NORTHEAST);
	FVector c = GetCornerForTile(Center, SOUTHEAST);
	FVector d = GetCornerForTile(Center, SOUTH);
	FVector e = GetCornerForTile(Center, SOUTHWEST);
	FVector f = GetCornerForTile(Center, NORTHWEST);

	//Add a vertex per corner
	int32 A = AddVertice(a);
	int32 B = AddVertice(b);
	int32 C = AddVertice(c);
	int32 D = AddVertice(d);
	int32 E = AddVertice(e);
	int32 F = AddVertice(f);

	//Create triangles
	int32 AT = AddTriangle(B, A, CenterIndex);
	int32 BT = AddTriangle(C, B, CenterIndex);
	int32 CT = AddTriangle(D, C, CenterIndex);
	int32 DT = AddTriangle(E, D, CenterIndex);
	int32 ET = AddTriangle(F, E, CenterIndex);
	int32 FT = AddTriangle(A, F, CenterIndex);
      
  //Add all A-F and AT-FT data into the tile struct
}

To generate the mesh (after doing biome, resources, and flora) you need to add a ProceduralMeshComponent to your actor and have the following arrays in your class:

TArray<FVector> AllVertices;
TArray<int32> AllTriangles;
TArray<FVector> AllNormals;
TArray<FVector2D> AllUVs;
TArray<FLinearColor> AllColors; //I leave this empty

AllVertices and AllTriangles get filled during the generation of the flat hexagonal grid.

To spawn the generated level use the code below. To be able to do that you need to include “ProceduralMeshComponent.h” and add “ProceduralMeshComponent” to the PublicDependencyModuleNames in your Build.cs.

for (int i = 0; i < AllVertices.Num(); i++)
{
    //I want all normals to be up, but you can change this if you want to
    AllNormals.Add(FVector::UpVector);
}

LevelMesh->CreateMeshSection_LinearColor(0, AllVertices, AllTriangles, AllNormals, AllUVs, AllColors, TArray<FProcMeshTangent>(), false);

Creating the island shape

To create the island shape I create a simple Perlin noise map and apply a radial alpha on top of it. This lerps the edges of the noise toward a darker tone, and if we then threshold with a minimum alpha (IslandMinAlpha below), it effectively creates an island-like shape. Here’s the end result of an island noise map.

The dark spots in the center of the noise map become ocean tiles as well, and later on we flood-fill to determine whether each body of water is a lake. (A lake is defined as water with no connection to the border without crossing land.)

//Map X index to 0-1
float UseX = (float)x / (float)xGridSize;

//Map Y index to 0-1
float UseY = (float)y / (float)yGridSize;

//Get Noise at X and Y
float NoiseValue = Noise->GetNoise2D(UseX, UseY);

//In our case the noise returns -1 to 1, we want to map it to 0 to 1
float Value = MapRangeClamped(NoiseValue, -1.f, 1.f, 0.f, 1.f);

//Get our distance from the center rounded to nearest whole
int Distance = Round((CenterLocation - FVector2D(x, y)).Size());

//The further away from center the less likely it is to be island
float RadialAlpha = MapRangeClamped(Distance, 0, MaxDistanceFromCenter, 1.f, 0.f);

//Apply this value to our mapped noise value, which fakes a radial alpha
const float FinalValue = Value * RadialAlpha;

if (FinalValue > IslandMinAlpha)
{
  MarkTileAsType(GRASSLAND, TileIndex);
}
else
{
  MarkTileAsType(OCEAN, TileIndex);
}

Generating Biomes

To generate biomes I use 3 noise maps (One for Tundra, one for Woodland and one for Snowland).

We take the value we get from the noise map (0 to 1) and round it to the nearest whole number (0.0-0.49 → 0, 0.50-1.0 → 1); this creates very hard-edged shapes, which is useful for biomes.

For Tundra and Woodland we create an RGB value (Red = Tundra, Green = Woodland, Blue = nothing).

Based on the frequency of the Perlin noise per biome and different weight of the biome it can create different shapes and sizes.

With the RGB value we can determine if this tile is either black (Grassland), green (Woodland) or red (Tundra); an overlap of both colors is still Woodland. Example of a biome noise map below.

For our purposes we want the Snowland biome to always sit in the center of the map with a minimum radius. To do that we take the tile’s distance from the center: if it is smaller than SnowRadius the tile is always snow; if it is between SnowRadius and SnowRadius * 2 we apply a radial alpha to the snow noise map, which creates a non-round falloff around the snow biome.

if (Tile->GetGroundType() == GRASSLAND || Tile->GetGroundType() == WATER)
{
  //Get tundra noise (cast to float to avoid integer division)
  float fTundraNoise = TundraNoise->GetNoise2D((float)x / (float)xGridSize, (float)y / (float)yGridSize);
  //Get woodland noise
  float fWoodlandNoise = WoodlandNoise->GetNoise2D((float)x / (float)xGridSize, (float)y / (float)yGridSize);
  //Get snowland noise
  float fSnowlandNoise = SnowlandNoise->GetNoise2D((float)x / (float)xGridSize, (float)y / (float)yGridSize);

  //Tundra final value
  float R = MapRangeClamped(fTundraNoise, -1.f, 1.f, 0.f, 1.f) * TundraWeight;
  //Woodland final value
  float G = MapRangeClamped(fWoodlandNoise, -1.f, 1.f, 0.f, 1.f) * WoodlandWeight;
  //Snowland final value
  float A = MapRangeClamped(fSnowlandNoise, -1.f, 1.f, 0.f, 1.f);

  //Round R and G to nearest whole number
  R = RoundToInt(R);
  G = RoundToInt(G);

  //Get distance from the center of the map in tile distance
  int Distance = RoundToInt((CenterLocation - FVector2D(x, y)).Size());

  //Get radial addition for snowland (1 within SnowRadius, so snow always passes there)
  float RadialAddition = MapRangeClamped(Distance, SnowRadius, SnowRadius * 2, 1.f, 0.f);

  //Get radial alpha for snowland (0 beyond SnowRadius * 2, so snow never passes there)
  float RadialAlpha = MapRangeClamped(Distance, SnowRadius, SnowRadius * 2, 1.f, 0.f);

  //Get ground type from color (Red = Tundra, Green = Woodland, Black = Grassland)
  Type = GetBiomeFromColor(FVector(R, G, 0));

  //Check if the tile is within the snow radius and, if not, if the tile reaches MinSnowAlpha
  if ((A + RadialAddition) * RadialAlpha > MinSnowAlpha)
  {
    //If so we want this to be Snowland (override the previous biome)
    Type = SNOWLAND;
  }

  //If we are snowland
  if (Type == SNOWLAND)
  {
    //And we are water then this water (lake) should become ice water
    if (Tile->GetGroundType() == WATER)
    {
      Type = ICEWATER;
    }
  }

  //If the tile type is already water and not ice water then we want to keep it water
  if (Tile->GetGroundType() == WATER && Type != ICEWATER)
  {
    Type = WATER;
  }

  //At the end mark this tile as the type
  MarkTileAsType(Type, TileIndex);
}

Generating Flora

To create trees we again create one noise map per biome, in this case 4 noise maps (Tundra, Woodland, Grassland and Snowland); each biome has a different density and frequency for its trees.

For example, Woodland has dense clusters of trees while grassland has more single standing trees and tundra barely has trees.

For trees, we get the noise value at the tile index, and if it’s larger than a minimum alpha, a tree spawns on that tile.

Values I used for a 150 x 150 grid for the biomes:

  • Tundra -> 0.2 Frequency and 0.3 minimum alpha
  • Woodland -> 0.2 Frequency and 0.5 minimum alpha
  • Grassland -> 0.5 Frequency and 0.5 minimum alpha
  • Snowland -> 0.2 Frequency and 0.4 minimum alpha

The lower the frequency, the larger the maximum size of a single cluster of trees; the higher the minimum alpha, the fewer trees a single cluster will have.

Below are examples of various woodland settings:

  • 0.2 Frequency and 0.5 minimum alpha
  • 0.1 Frequency and 0.5 minimum alpha
  • 0.2 Frequency and 0.15 minimum alpha

Generating Resources

Creating resources is done with one noise map per resource; we have 3 (Stone, Iron, Gold). For every tile we get the noise value per resource type, and if the value is lower than the minimum alpha for that resource type we set it to 0 so it doesn’t spawn on that tile.

After that, we again use the 3 noise values as RGB data, and we select which is the strongest on that tile by picking the highest value. If no value is larger than 0, then none of the noise maps reached the required minimum alpha and no resource spawns.

After a resource passes the minimum alpha check, we do one other check with a weight value: a weighted random roll for that resource, and only if the roll succeeds does the resource add itself to the level.

Resources are quite rare and would rarely spawn in clusters so we use very high frequency values and high minimum alphas. For our purpose, we have different weights per biome but the same minimum alpha globally (Gold can never spawn in grassland for example).

Values I used for the resources in Snowland biome:

  • Iron -> 50 Frequency and 0.8 weight
  • Stone -> 50 Frequency and 0.9 weight
  • Gold -> 50 Frequency and 0.7 weight

Here’s an example of a resource noise map. (R= Stone, G = Iron, B = Gold)

//Get Stone noise (cast to float to avoid integer division)
const float fStoneNoise = StoneNoise->GetNoise2D((float)x / (float)xGridSize, (float)y / (float)yGridSize);
//Get Iron noise
const float fIronNoise = IronNoise->GetNoise2D((float)x / (float)xGridSize, (float)y / (float)yGridSize);
//Get Gold noise
const float fGoldNoise = GoldNoise->GetNoise2D((float)x / (float)xGridSize, (float)y / (float)yGridSize);

//Stone final value
float R = MapRangeClamped(fStoneNoise, -1.f, 1.f, 0.f, 1.f);
//Iron final value
float G = MapRangeClamped(fIronNoise, -1.f, 1.f, 0.f, 1.f);
//Gold final value
float B = MapRangeClamped(fGoldNoise, -1.f, 1.f, 0.f, 1.f);

if (R < StoneMinAlpha) //Lower than the minimum alpha, never pick
{
  R = 0.f;
}
if (G < IronMinAlpha) //Lower than the minimum alpha, never pick
{
  G = 0.f;
}
if (B < GoldMinAlpha) //Lower than the minimum alpha, never pick
{
  B = 0.f;
}

NewResourceType = NONE;

if (R > G && R > B) //Pick stone
{
  NewResourceType = STONE;
}
else if (G > R && G > B) //Pick iron
{
   NewResourceType = IRON;
}
else if (B > R && B > G) //Pick Gold
{
  NewResourceType = GOLD;
}

//Did we select a resource?
if (NewResourceType != NONE)
{
  if (RandomBoolWithWeightFromStream(GetBiomeResourceWeight(Biome, NewResourceType)))
  {
    //Spawn the resource and set tile resource type to NewResourceType
    Tile->SetResourceType(NewResourceType);
  }
}

UVS and Materials

Obviously, after you generate the shape and set up all your biomes, you need to be able to color your island. Based on the biome of every tile I select a different UV space per biome: I create a triangle shape in the UV, per triangle, located where the mask for that biome goes. Below is a picture of the UVs per biome.

To actually color/materialize the mesh I created a mask per triangle location and apply a different world-aligned texture per mask, then combine everything into a single material. This is probably not optimal and can be improved, but then again I’m not a material artist 🙂

Here’s an image of the material I created.