VerifyMe

- Learned Embedding Space Interactive Visualisation

Description

This graph contains 4,493 text samples of ≈10K characters each, written by 995 authors from the Project Gutenberg corpus.

Each point represents one text passage. Each passage was cleaned, converted to a stylometric embedding, z-score normalized per feature and percentile-normalized using corpus statistics, passed through EmbeddingNet (the transformer-encoder-based model within my siamese network), and finally reduced from 128 to 3 dimensions using UMAP.
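As a rough illustration of the normalization step, here is a minimal sketch that z-scores each feature against corpus statistics and then maps it to a corpus percentile. This is one possible reading of the description, not the project's actual code: the function names `fit_corpus_stats` and `normalize`, the order of the two steps, and the use of population standard deviation are all assumptions.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def fit_corpus_stats(corpus):
    """Collect per-feature mean, std, and sorted z-scores from a corpus.

    `corpus` is a list of equal-length feature vectors, one per passage.
    (Hypothetical helper; the real pipeline may store statistics differently.)
    """
    columns = list(zip(*corpus))  # transpose: one tuple per feature
    stats = []
    for col in columns:
        mu = mean(col)
        sigma = pstdev(col) or 1.0  # guard against constant features
        sorted_z = sorted((x - mu) / sigma for x in col)
        stats.append((mu, sigma, sorted_z))
    return stats


def normalize(vector, stats):
    """Z-score each feature, then map it to its corpus percentile in [0, 1]."""
    out = []
    for x, (mu, sigma, sorted_z) in zip(vector, stats):
        z = (x - mu) / sigma
        rank = bisect_right(sorted_z, z)  # number of corpus values <= z
        out.append(rank / len(sorted_z))
    return out
```

In this sketch, `fit_corpus_stats` would be run once over the stylometric vectors of the whole corpus, and `normalize` applied to each passage before it is passed to the model. Mapping to percentiles bounds every feature to [0, 1], which limits the influence of outlier passages.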

Project code

Note: Click the dropdown to colour samples by author.