AI/ML

Entropy Compass

Interactive noise and prompt manipulation interface for image generation.

Year :

2025

Role :

Full-stack Engineer

Tools :

TypeScript, JavaScript, Python

Project Duration :

1.5 months

Impact :

Reimagined latent space exploration in Stable Diffusion by creating an intuitive, gamified interface that demystifies the generative process, empowering users with creative control over AI image synthesis.

VIDEO :

A demo of the fully functional app.

Background :

In the field of generative AI and image synthesis, Stable Diffusion models operate by manipulating images in latent space through processes of adding and removing noise. However, these internal mechanisms are typically hidden from users, limiting their ability to control and understand the image generation process creatively.
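As a quick sketch of that mechanism in standard diffusion notation (not the project's own formulation): the model works on a latent z_0, the latent-space representation of an image, to which the forward process adds Gaussian noise and from which the learned reverse process removes it step by step, guided by the text prompt.

```latex
% Forward (noising) process in latent space, standard Stable Diffusion / DDPM notation:
z_t = \sqrt{\bar{\alpha}_t}\, z_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)
% \bar{\alpha}_t is the cumulative noise schedule. Sampling runs this in reverse,
% using the network's noise prediction \epsilon_\theta(z_t, t, c) with text
% conditioning c to step from pure noise z_T back to a clean latent z_0.
```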

CHALLENGE :

Moreover, current AI image-generation interfaces are unintuitive and lack meaningful creative workflows. While optimized for efficiency, they leave users without visual intuition or a sense of connection to the generative process.

Research Questions

  1. How can we expose hidden steps of the generative process?

  2. What interface paradigms best support latent space manipulation?

  3. How does gamification impact user engagement?

We mapped out each step of the user journey, covering the user's actions, touchpoints, and emotional experience. Comparing the pros and cons of different interface approaches on this basis, we concluded that spatial exploration maximizes interactivity and engagement, giving users the greatest room for creativity.

Experiments

We are interested in exposing the “black box” of the diffusion model: its latent space. We use the bidirectional control of “re-noising” (DDIM inversion) and “denoising” as the vehicle of image generation.
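In standard DDIM notation (a sketch, not the project's exact equations), the deterministic denoising step first estimates the clean latent and then moves one step toward it; inversion runs the same update in the opposite direction, which is what gives the interface its bidirectional re-noise/denoise control.

```latex
% Predicted clean latent at step t (c is the text conditioning):
\hat{z}_0 = \frac{z_t - \sqrt{1 - \bar{\alpha}_t}\,\epsilon_\theta(z_t, t, c)}{\sqrt{\bar{\alpha}_t}}
% Deterministic DDIM denoising step (t -> t-1):
z_{t-1} = \sqrt{\bar{\alpha}_{t-1}}\,\hat{z}_0 + \sqrt{1 - \bar{\alpha}_{t-1}}\,\epsilon_\theta(z_t, t, c)
% DDIM inversion ("re-noising") applies the same update with t -> t+1, assuming
% \epsilon_\theta changes little between adjacent steps, mapping a clean latent back toward z_T.
```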

Interaction Design :

We want to focus on "continuous play" to bring gamified elements into text-based image generation and latent space exploration. The two images on the left show the "1D canvas", which uses horizontal and vertical movements; the two on the right show the "2D canvas", which uses a 360-degree, slingshot-style interaction.
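As a rough illustration of how a slingshot gesture could translate into generation parameters (the names and scaling below are hypothetical, not the project's actual code), the drag vector can be decomposed into a distance that controls how many steps to take and an angle that picks the direction of exploration:

```typescript
// Hypothetical mapping from a slingshot drag on the 2D canvas to generation parameters.
interface DragParams {
  steps: number;                      // how far to denoise / re-noise
  angle: number;                      // 360-degree direction of exploration, in radians
  direction: "denoise" | "renoise";   // e.g. pulling right denoises, pulling left re-noises
}

function dragToParams(dx: number, dy: number): DragParams {
  const distance = Math.hypot(dx, dy);
  return {
    steps: Math.min(50, Math.max(1, Math.round(distance / 10))), // longer pull = bigger jump
    angle: Math.atan2(dy, dx),
    direction: dx >= 0 ? "denoise" : "renoise",
  };
}
```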

UX DESIGN :

We designed a few interfaces, but after comparing them, the two shown below stood out. We selected Design 2 for its borderless canvas, which eliminates visual constraints and creates a more immersive user experience. The refined aesthetic reduces visual noise, allowing users to focus entirely on exploring and interacting with generated images, without unnecessary interface elements competing for attention.

Frontend :

The left window (1D canvas) focuses on a linear, procedural view for clarity of flow and process. The right window (2D canvas) provides a holistic view of all generated images, enhancing the gameplay experience.
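A minimal sketch of a state model that could back both views (illustrative only; the field names are assumptions): each generated image keeps a link to its parent so the 1D canvas can render the linear chain, while its coordinates place it on the 2D canvas.

```typescript
// Illustrative state shared by the 1D and 2D canvases (not the project's actual types).
interface GeneratedImage {
  id: string;
  url: string;
  prompt: string;
  noiseLevel: number;                  // position along the re-noise/denoise axis
  parentId: string | null;             // previous image in the generation chain
  position: { x: number; y: number };  // placement on the 2D canvas
}

// The 1D canvas renders the chain from the selected image back to its root.
function timeline(images: Map<string, GeneratedImage>, selectedId: string): GeneratedImage[] {
  const chain: GeneratedImage[] = [];
  let current = images.get(selectedId);
  while (current) {
    chain.unshift(current);
    current = current.parentId ? images.get(current.parentId) : undefined;
  }
  return chain;
}
```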

Backend :

Multiple APIs are called in a single generation process, and new images are stored, mapped, and displayed back to the canvases.
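A simplified sketch of one generation round (endpoint paths and payload shapes are assumptions, not the project's real API): the client requests a new image from the inference service, persists the result, and hands the stored record back to the canvases for mapping and display.

```typescript
// Hypothetical orchestration of a single generation step; endpoint paths are illustrative.
async function generateStep(
  sourceImageId: string,
  prompt: string,
  params: { direction: "denoise" | "renoise"; steps: number },
) {
  // 1. Request a new image from the inference API.
  const generated = await fetch("/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ sourceImageId, prompt, ...params }),
  }).then((res) => res.json());

  // 2. Store the result so it can be revisited and branched from later.
  await fetch("/api/images", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(generated),
  });

  // 3. The caller maps the stored record onto the 1D and 2D canvases for display.
  return generated;
}
```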

Summary :

The project successfully reimagines latent space exploration through an intuitive, game-like interface that exposes Stable Diffusion's underlying processes. While the current implementation demonstrates the potential of direct manipulation and auto-generated prompts for creative control, there's room to improve the balance between power and accessibility.

Future work will focus on expanding evaluation metrics, optimizing prompt generation and backend performance, implementing branching history for multiple creative paths, and developing better tools to visualize noise-image relationships. Collaborative features could further enrich the creative exploration process.
