Kaggle Redesign

In this 3-day solo design challenge, I aim to redesign Kaggle's dataset preview function to improve search and preview capabilities across multiple files. The redesign focuses on supporting novice data analysts and helping them quickly find answers to their questions.

Problem

Kaggle's current preview feature falls short when users attempt to find the information they need across multiple (>2) files.

Let's look at this YouTube video statistics dataset.

Using the current preview experience, here are data questions that cannot be easily answered:
1. Is there a video that appears in both the US & CA?
2. How many unique videos are there in Europe in total?
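
To show why these questions are hard to answer from a single-file preview, here is a minimal Python sketch of the cross-file work they require today. The file contents, the `video_id` column name, and the choice of GB and DE as the European files are assumptions for illustration, not the actual dataset.

```python
import csv
import io

# Hypothetical miniature stand-ins for the per-country files.
# The real dataset's file names and column names may differ.
FILES = {
    "US": "video_id,title\na1,Intro\nb2,Music\nc3,News\n",
    "CA": "video_id,title\nb2,Music\nd4,Sports\n",
    "GB": "video_id,title\nb2,Music\ne5,Film\ne5,Film\n",
    "DE": "video_id,title\ne5,Film\nf6,Tech\n",
}


def video_ids(csv_text):
    """Return the set of distinct video_id values in one CSV file."""
    return {row["video_id"] for row in csv.DictReader(io.StringIO(csv_text))}


ids_by_country = {country: video_ids(text) for country, text in FILES.items()}

# Question 1: is there a video that appears in both the US and CA files?
in_both = ids_by_country["US"] & ids_by_country["CA"]

# Question 2: how many unique videos are there across the European files?
# (GB and DE are assumed to be the European files here.)
europe = ids_by_country["GB"] | ids_by_country["DE"]

print(sorted(in_both))
print(len(europe))
```

Both answers need every relevant file loaded and combined at once, which is exactly the step a novice without these tools cannot take inside the current preview.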

Compared to experienced data analysts, novices face more challenges.

Research shows that experienced data scientists are not bothered by the limits of the data preview, since they have access to advanced tools. Novice data analysts, who lack knowledge of or experience with advanced analysis tools, usually struggle the most.

Goal

How might we improve search and preview capabilities across multiple files so that novice data analysts who lack advanced tool knowledge can quickly find the information they need?

After gaining a better understanding of the design goal, I developed a user journey to identify the specific challenges novice data analysts face in the dataset preview.
Because this design challenge was time-constrained, my approach was to take a closer look at how competitors operate and draw inspiration from them.

I'll then evaluate how we can make the redesign of Kaggle's preview function stand out from the rest. Here are the questions I kept in mind during competitor analysis:

🤔

  • Do they provide dataset or file previews on their site?

  • How do they display and organize multiple files in the dataset or file preview?

  • Is their approach to multiple file previews user-friendly and conducive to exploring data and discovering insights?

After analyzing two competitors, I found that they have integrated data analysis preview tools, allowing users to perform easy data analysis directly on their site.

💡

Built-in data analysis preview tool.

  1. Easy to use, neat design.
  2. Shows graphics.
  3. Allows users to optimize.

Users need basic data analysis (DA) knowledge and skills.

Competitors didn't provide a direct data or file summary apart from the data graphics.

All dataset files are visually previewed at once.

Kaggle: Only one file is expanded; users click "Data Explorer" to see the other files.

Good:

  1. Quicker way to scan all the data.
  2. Easier to catch insights than with only one file.

But...

Users still can’t find an insight directly if it requires analyzing two files.

Built-in data analysis preview.

Good:

  1. Quickly get insights through a simple data analysis tool.
  2. Easy to use & neat.

But...

Still requires basic knowledge of data analysis, since the analysis configuration needs manual input.

Files are previewed by their content. This gives the necessary information but sets a high bar for novice DA users.

Built-in auto-generated data analysis.

Good:

  1. Get insights effortlessly.
  2. Less technical knowledge required.

But...

  1. Some datasets still need optimizing to get the answer.
  2. No summary, only graphs.
I then crafted potential solutions based on the earlier insights.
1️⃣ Add auto-generated interactive visualization tools to the data preview (Graphics + Summary).
Easy & efficient.

Allows users to get insights from either the graphics or the summary.
⚠️ Users might select an unrelated file and get an unrelated answer, which increases cognitive load.

2️⃣ Use AI chat for a more goal-directed experience.
Most goal-directed.
Easy & efficient.
⚠️High cost.
⚠️When datasets contain many files, automatic identification and analysis can lead to prolonged processing times.

Considering that Kaggle prioritizes user engagement and ease of use, and that users are more comfortable with visualizations (85% think data preview is useful), solution 1 with auto-generated interactive visualization tools is the better fit.

🤔

However, it's essential to address the risk of users selecting unrelated files. Implementing clear instructions or file selection safeguards could mitigate this issue.

The solution to address the risk of users selecting unrelated files can involve adding search keywords for datasets and providing file previews during analysis.
1️⃣ Add keyword search for dataset files.

Enter keywords in the search box.

The system automatically highlights the files that contain the keywords.
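
The highlighting step can be sketched as a simple match against file names and column headers. This is a minimal illustration under my own assumptions, not how Kaggle implements search: the `files_matching` helper, the file names, and the column lists are all hypothetical.

```python
def files_matching(keyword, files):
    """Return the names of files whose name or column headers contain the keyword.

    `files` maps a file name to its list of column headers (hypothetical shape).
    Matching is case-insensitive; a blank keyword highlights nothing.
    """
    needle = keyword.strip().lower()
    if not needle:
        return []
    return [
        name
        for name, columns in files.items()
        if needle in name.lower() or any(needle in col.lower() for col in columns)
    ]


# Hypothetical dataset listing used only for this example.
DATASET_FILES = {
    "USvideos.csv": ["video_id", "title", "views"],
    "CAvideos.csv": ["video_id", "title", "views"],
    "US_category_id.json": ["id", "category"],
}

highlighted = files_matching("views", DATASET_FILES)
print(highlighted)
```

Matching on column headers as well as file names is the design choice that matters here: it lets a novice search for what they want to know ("views") rather than for the file they would have to guess.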

2️⃣ File previews during analysis.

Click the file.

Show the file details for preview.

After testing with potential users, here is the new journey map showing how users will work with the updated data preview function.
Next Steps

1. User Documentation:
Prepare user documentation, including guides, tutorials, or tooltips, to help users understand the new features and functionalities.

2. Gather Feedback:
Use the thumbs up/down feature on the analysis result page to collect user feedback that informs continuous improvements.

Epilogue and reflection

Throughout this 3-day design challenge, I learned how to quickly identify a design solution and complete a design project independently. Due to the limited scope of research and time constraints, I acknowledge that the research and context were not comprehensive. If given additional time, I would conduct more user testing to gather feedback on the user experience, file selection process, and the effectiveness of the visualization tools. This feedback would be instrumental in making iterative improvements to the prototype. Additionally, I would refine the prototype based on the usability testing results, making necessary iterations to enhance areas that users found confusing or challenging.
