About the Sample Data

This dataset is provided for applicants to the Data Scientist role at Slant. It simulates the type of data a data scientist might work with in our day-to-day analysis.

All transactions, brands, dates, and amounts have been fully simulated and shuffled — no real customer data is included. However, the structure, patterns, and complexity of the data are realistic and representative of the datasets a successful analyst at Slant would regularly engage with.

The sample is designed to reflect a random selection of 5,000 customers from each of October, November, and December for the years 2023 and 2024.

Your task is to apply the tools and thinking of a data scientist to interrogate the data and surface insights you find compelling. Focus on analytical rigour and the clarity of your reasoning rather than presentation polish or complex visualisations.

For example, you may choose to:

  • Develop features that capture user-level behavioural patterns, such as spending frequency, category preferences, merchant concentration, or time-based trends.
  • Use appropriate unsupervised learning techniques (e.g. clustering algorithms like K-means or DBSCAN) to identify groups of customers with similar behaviours.
  • Reflect on what differentiates these customer segments and what potential business value could be derived from understanding these patterns.
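
As a minimal sketch of the first idea above, the snippet below rolls transactions up into per-customer features using only the Python standard library. The field names (`customer_id`, `merchant`, `amount`) and the sample rows are hypothetical and may not match the actual dataset. Merchant concentration is computed here as a Herfindahl-style sum of squared spend shares, which is one possible choice among many.

```python
from collections import Counter, defaultdict

# Hypothetical schema and made-up rows: the real field names may differ.
transactions = [
    {"customer_id": "c1", "merchant": "A", "amount": 20.0},
    {"customer_id": "c1", "merchant": "A", "amount": 30.0},
    {"customer_id": "c1", "merchant": "B", "amount": 50.0},
    {"customer_id": "c2", "merchant": "C", "amount": 10.0},
]


def customer_features(rows):
    """Aggregate transaction rows into one feature dict per customer."""
    by_customer = defaultdict(list)
    for row in rows:
        by_customer[row["customer_id"]].append(row)

    features = {}
    for customer_id, txns in by_customer.items():
        total = sum(t["amount"] for t in txns)

        # Total spend per merchant for this customer.
        spend_by_merchant = Counter()
        for t in txns:
            spend_by_merchant[t["merchant"]] += t["amount"]

        # Sum of squared spend shares: 1.0 means all spend went to a
        # single merchant; lower values mean spend is spread more widely.
        if total:
            concentration = sum(
                (s / total) ** 2 for s in spend_by_merchant.values()
            )
        else:
            concentration = 0.0

        features[customer_id] = {
            "n_transactions": len(txns),
            "total_spend": total,
            "avg_spend": total / len(txns),
            "merchant_concentration": concentration,
        }
    return features


features = customer_features(transactions)
```

Feature vectors like these could then be scaled and passed to a clustering algorithm such as K-means or DBSCAN to look for groups of customers with similar behaviour.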

You can submit your workbook or other materials to join-slant@slantresearch.com

To receive the data, you must first validate your email address. Click the button and complete the form, then check your inbox (the message may land in your spam folder).

If you have any problems accessing the data, please email info@slantresearch.com