
A peaceful morning

Picture yourself as a curator in a museum.

One peaceful morning, your museum receives a large crate of archives from Ancient Greece.


You are tasked with organizing and displaying these archives in a way that is both informative and coherent to visitors.

How would you go about this? Grouping by style? By time period? By subject matter? And to make things more complicated, how many groups should you divide the archives into? Suddenly, the morning is not so peaceful anymore.

As you might have noticed, there is no right answer for this. And even if you find the optimal way to do this, would that approach work for another crate with a completely different set of archives? Probably not.

Topic Modeling

This is where the concept of Topic Modeling comes in to restore the peacefulness of your morning.

Suppose you could feed the archives into a machine, and it would automatically group them into coherent topics. Recent techniques in Natural Language Processing, such as BERTopic, have made this possible by leveraging the power of transformers. Problem solved then, right?

Not so fast! There is no silver bullet for using these tools, my dear curator. No oracle can tell you the best way to group your archives. Natural language is fuzzy, and what constitutes a good topic for you might not be the same for someone else.

Clusview

And so, we arrive at Clusview, a tool designed to ease the process of exploring and understanding topics through an interactive approach, so you can decide the best way to group your archives.

Learn how to install Clusview below, or kickstart your topic modeling journey with these quicklinks.

Installation

Currently, there is no hosted solution for Clusview. The easiest way to use Clusview is to install it locally and import it into your projects.

  1. Clone the Clusview repository from GitHub.

    git clone https://github.com/gcalcedo/clusview.git
  2. In your own project, install the core package from the Clusview repository you just cloned, replacing path_to_clusview with the location of that repository.

    pip install path_to_clusview/packages/core/
  3. Done! Clusview components can now be imported into your Python scripts.

    from clusview.samplers.parameters.linear_sampler import LinearSampler

    # Sample the parameter "x" at 6 evenly spaced values between 0 and 10.
    sampling = LinearSampler("x", 0, 10, 6).sample_range()
    print(sampling)
    # Output: [0, 2, 4, 6, 8, 10]
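
To get a feel for what the example above produces, here is a minimal standalone sketch of linear sampling in plain Python. It is not Clusview's implementation: the function name `linear_range` is hypothetical, and reading the `LinearSampler` arguments as start, stop, and number of samples (endpoints included) is an assumption inferred from the printed output.

```python
def linear_range(start: float, stop: float, count: int) -> list[float]:
    """Return `count` evenly spaced values from `start` to `stop`, inclusive.

    Hypothetical helper for illustration only; not part of Clusview.
    """
    if count < 2:
        raise ValueError("count must be at least 2")
    # Distance between consecutive samples.
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


print(linear_range(0, 10, 6))
# Output: [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
```

Note that this sketch returns floats, whereas the Clusview example prints integers, so the real sampler likely handles types differently.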