5 tools for generating random or fictitious data (Fake Data Generator)

Uzan Muyumba Benjamin
5 min read · Apr 18, 2024


Generate data for your tests and demonstrations in less than 3 minutes

Photo by Edge2Edge Media on Unsplash

Whether for study or for work, sooner or later you will need a dataset for a project: to demonstrate new features to customers so that they understand them better, or to test a software application. Often, you can’t find a suitable, specific dataset anywhere. Of course, you don’t want to enter all the data by hand with SQL queries or by typing a value into each Excel cell, and you don’t want to settle for a dataset that doesn’t give the results and observations expected during the test.

So what should we do? Here are the 3 solutions available to us:

Firstly, make do with the dataset that doesn’t meet our needs (don’t do that!).

Secondly, do the dirty work by hand: fill in 1,000 rows of 50 columns, i.e. 50,000 cells, and waste your time, even if the exercise is instructive.

And finally, the best solution is to do the dirty work in a few seconds and generate a fictitious dataset that comes as close as possible to what we want!

But first, what is a fake data generator? It’s a tool that writes fictitious but realistic test data to a file: fake or random postal addresses, colors, countries, cities, credit card numbers, dates and times, genders, ID numbers, amounts of money, people’s names, e-mail addresses, etc. It is designed to help developers and testers generate test data for software applications.
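To make the idea concrete, here is a minimal sketch of what such a generator does under the hood, written in Python with only the standard library. The sample lists, field names and function names are my own assumptions for illustration; none of the tools below necessarily works this way.

```python
import random

# Hypothetical sample pools; a real generator ships far larger lists.
FIRST_NAMES = ["Alice", "Bruno", "Chen", "Dalia", "Emeka"]
LAST_NAMES = ["Martin", "Okafor", "Silva", "Tanaka", "Weber"]
CITIES = ["Kinshasa", "Lyon", "Osaka", "Porto", "Toronto"]


def fake_person(rng):
    """Build one fictitious record by picking values from the sample pools."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    return {
        "name": f"{first} {last}",
        # example.com is reserved for documentation, so no real inbox is used.
        "email": f"{first}.{last}@example.com".lower(),
        "city": rng.choice(CITIES),
        "age": rng.randint(18, 90),
    }


def generate_rows(count, seed=None):
    """Return `count` fake records; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return [fake_person(rng) for _ in range(count)]


if __name__ == "__main__":
    for row in generate_rows(5, seed=42):
        print(row)
```

Passing a seed is a deliberate choice here: the same seed gives the same rows on every run, which is handy when a demonstration or a test has to be reproducible.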

Here are the 5 fake data generators I recommend. They are listed from the most basic to the most complete:

1. Cobbl

Screenshot from Cobbl.io

Cobbl.io is simple and totally free, and its aim is to “make it as simple as possible to bring your projects to life with realistic (but fake!) data.” It is the easiest to use and can generate a maximum of 10,000 lines.

1.1 — Pros

  • 12 categories of random data
  • 91 data type options
  • A maximum of 10,000 lines
  • Very easy to use (user-friendly, ergonomic interface)

1.2 — Cons

  • Only 3 export formats (CSV, JSON, JSONL)
  • No control over the data generated
  • No project manager
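If you are not sure how CSV, JSON and JSONL differ, the following Python sketch serializes the same two records in each format. The records and function names are hypothetical, and this is not Cobbl’s own code, only an illustration of the three layouts.

```python
import csv
import io
import json

# Two hand-written sample records standing in for generated data.
ROWS = [
    {"id": 1, "name": "Alice Martin", "city": "Lyon"},
    {"id": 2, "name": "Chen Tanaka", "city": "Osaka"},
]


def to_csv(rows):
    """CSV: one header line, then one comma-separated line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """JSON: a single document, here one array holding every record."""
    return json.dumps(rows, indent=2)


def to_jsonl(rows):
    """JSON Lines: one standalone JSON object per line, no enclosing array."""
    return "\n".join(json.dumps(row) for row in rows) + "\n"


if __name__ == "__main__":
    for label, text in (("CSV", to_csv(ROWS)), ("JSON", to_json(ROWS)), ("JSONL", to_jsonl(ROWS))):
        print(f"--- {label} ---")
        print(text)
```

In practice, CSV opens directly in a spreadsheet, JSON suits APIs and configuration, and JSONL is convenient for large files because each line can be read on its own.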

2. AutoTestData

Screenshot from AutoTestData

AutoTestData is an easy-to-use data generator. It offers more export formats than its predecessor and a maximum of 100,000 lines.

2.1 - Pros

  • 12 export formats (CSV, JSON, Excel, SQL, XML and some programming languages)
  • 15 categories of random data
  • 40 options for customizing data types
  • Can generate up to 100 lines in its demo version
  • Offers control over the data generated
  • Easy to use
  • Includes a user tutorial

2.2 - Cons

  • Requires you to create an account (free of charge) to generate up to 100,000 lines
  • No project manager

3. OnlineDataGenerator

Screenshots from the demo page OnlineDataGenerator

OnlineDataGenerator is a free data generator that can generate up to 100,000 rows and, unlike the two previous tools, includes a project manager.

3.1 — Pros

  • 5 export formats (CSV, JSON, Excel, SQL and XML)
  • 13 categories of random data
  • 90 options for customizing data types
  • Can generate up to 1,000 lines in its demo version
  • Offers control over the data generated
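Since the whole point is to avoid typing SQL inserts by hand, here is a rough Python sketch of what loading generated rows into a database can look like. It uses an in-memory SQLite database from the standard library; the `customers` table, its columns and the helper names are assumptions made for this example and have nothing to do with OnlineDataGenerator’s actual SQL output.

```python
import random
import sqlite3

# Hypothetical sample pool for the city column.
CITIES = ["Kinshasa", "Lyon", "Osaka", "Porto", "Toronto"]


def make_rows(count, seed=0):
    """Build (id, name, city) tuples with sequential ids starting at 1."""
    rng = random.Random(seed)
    return [(i, f"customer_{i}", rng.choice(CITIES)) for i in range(1, count + 1)]


def load_into_sqlite(rows):
    """Create the table and insert every row with one parameterized statement."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
    )
    # executemany runs the same INSERT once per tuple; the ? placeholders
    # let the driver handle quoting instead of building SQL strings by hand.
    conn.executemany("INSERT INTO customers (id, name, city) VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    connection = load_into_sqlite(make_rows(1000))
    count = connection.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    print(count)  # prints 1000
```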

3.2 — Cons

  • Requires you to create an account (free of charge) to generate up to 100,000 lines and to have access to a project manager

4. GenerateData

Screenshot from GenerateData

GenerateData is my favorite random data generator, even though it also has the biggest drawbacks of the list (see the cons below). It offers many more options than the previous tools, and it’s simple and open-source.

4.1 — Pros

  • 15 export formats, divided into 2 categories, the first for databases and the second for parameterized data structures in 7 programming languages.
  • 34 random data categories
  • Offers greater control over the data generated
  • Preview code and data to be generated
  • It’s open-source

4.2 — Cons

  • Requires a subscription to generate more than 500 lines
  • Requires a subscription to access the project manager
  • Not many customization options
  • Unintuitive handling

5. Mockaroo

Screenshot from Mockaroo

Mockaroo is the most complete random data generator, allowing you to generate data in a wide range of export formats, although not as programming-language data structures. It gives finer control over the generated data than the other tools. It also includes an interesting project manager that can be linked to a GitHub account.

5.1 — Pros

  • 11 export formats (CSV, JSON, SQL, Excel, XML…)
  • 13 random data categories that can be customized
  • 175 data type customization options
  • Lets you create your own data types
  • Supports regular expressions (regex) for controlling the data
  • Sequences of unique values for a field (effective for numeric primary keys)
  • Columns can be calculated using Ruby syntax (column = a + b)
  • The free version generates a maximum of 1,000 lines and includes a project manager
  • Etc…
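Three of these ideas (sequences, calculated columns and regex-controlled values) are easier to picture with a small example. The sketch below imitates them in plain Python; it does not use Mockaroo or its formula syntax, and the `sku` pattern, field names and helper functions are all made up for the illustration.

```python
import itertools
import random
import re
import string

# Hypothetical format rule: three uppercase letters, a dash, four digits.
SKU_PATTERN = re.compile(r"[A-Z]{3}-\d{4}")


def random_sku(rng):
    """Build a value by hand that is meant to satisfy SKU_PATTERN."""
    letters = "".join(rng.choice(string.ascii_uppercase) for _ in range(3))
    digits = "".join(rng.choice(string.digits) for _ in range(4))
    return f"{letters}-{digits}"


def generate_orders(count, seed=0):
    """Return order rows with a sequential id and a calculated total."""
    rng = random.Random(seed)
    ids = itertools.count(start=1)  # sequence: 1, 2, 3, ... never repeats
    rows = []
    for _ in range(count):
        quantity = rng.randint(1, 10)
        unit_price_cents = rng.randint(100, 5000)
        rows.append({
            "id": next(ids),
            "sku": random_sku(rng),
            "quantity": quantity,
            "unit_price_cents": unit_price_cents,
            # Calculated column: derived from two other columns of the row.
            "total_cents": quantity * unit_price_cents,
        })
    return rows


if __name__ == "__main__":
    for order in generate_orders(3):
        # The regex is used here as a check on the generated value.
        assert SKU_PATTERN.fullmatch(order["sku"])
        print(order)
```

Prices are kept as whole cents rather than floats so that the calculated total stays exact.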

5.2 — Cons

  • Requires a subscription to generate more than 1,000 lines
  • Unintuitive handling
  • Not enough export formats

Conclusion

In the end, whether you need the simplest tool or the most complete one, choose the one that best fits your needs and meets your expectations.

I discovered these tools during my last year of academic studies in software engineering. I hope they will be of great use to you, as they were to me then and are now 😉

Thank you very much!
