Detecting novelty
From 2nd Book
How to detect novelty in a session depends on the kind of data the session contains. The methods below are commonly used, grouped by domain:
1. Novelty in Data Streams
- Change Detection Algorithms:
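- CUSUM (Cumulative Sum): Accumulates deviations from a target mean and signals a change when the running sum crosses a threshold, which makes it sensitive to small, persistent shifts in the mean of a signal.
- EWMA (Exponentially Weighted Moving Average): Smooths the stream with a weighted average that favors recent observations and signals when the smoothed value drifts outside its control limits.
- Page-Hinkley Test: A sequential test related to CUSUM that tracks the cumulative deviation of observations from their running mean and signals when it exceeds a threshold; it targets changes in the mean rather than arbitrary changes in the distribution.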
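As an illustration of the change detection idea, here is a minimal sketch of a two-sided CUSUM detector. The function name `cusum_detect` and the parameters `target_mean`, `slack`, and `threshold` are assumptions chosen for this example; in practice they would be tuned to the noise level of the stream.

```python
def cusum_detect(values, target_mean, slack=0.5, threshold=5.0):
    """Return the index of the first detected mean shift, or None.

    slack: deviation from target_mean that is tolerated as noise (assumed value).
    threshold: accumulated evidence needed before signalling (assumed value).
    """
    upper = 0.0  # accumulated evidence of an upward shift
    lower = 0.0  # accumulated evidence of a downward shift
    for i, x in enumerate(values):
        upper = max(0.0, upper + (x - target_mean - slack))
        lower = max(0.0, lower + (target_mean - x - slack))
        if upper > threshold or lower > threshold:
            return i
    return None


# Hypothetical data: a stable signal around 0, then a jump to about 2.
stable = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1]
shifted = stable + [2.0, 2.1, 1.9, 2.2, 2.0]

print(cusum_detect(stable, target_mean=0.0))   # None
print(cusum_detect(shifted, target_mean=0.0))  # 11
```

The detector fires a few samples after the shift begins because evidence has to accumulate past the threshold; a lower threshold reacts faster at the cost of more false alarms.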
- Outlier Detection:
- Use statistical methods (e.g., z-scores or IQR) to flag unusual data points.
- Machine learning models like isolation forests or one-class SVMs can detect outliers indicative of novelty.
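The statistical checks above can be sketched with the Python standard library alone. This is a minimal example, assuming a trusted `baseline` sample is available to estimate the statistics; the function names and the thresholds (3 standard deviations, 1.5 times the IQR) are conventional choices, not requirements.

```python
import statistics


def zscore_flags(baseline, new_points, threshold=3.0):
    """Flag new points lying more than `threshold` standard deviations
    from the baseline mean."""
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)
    return [x for x in new_points if abs(x - mean) > threshold * stdev]


def iqr_flags(baseline, new_points, k=1.5):
    """Flag new points outside the fences [Q1 - k*IQR, Q3 + k*IQR]
    computed from the baseline."""
    q1, _, q3 = statistics.quantiles(baseline, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in new_points if x < low or x > high]


# Hypothetical measurements.
baseline = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]
incoming = [10, 12, 9.5, 20]

print(zscore_flags(baseline, incoming))  # [20]
print(iqr_flags(baseline, incoming))     # [20]
```

Estimating the mean and spread from a separate baseline, rather than from the data being checked, keeps a single extreme value from inflating the standard deviation and masking itself.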
2. Novelty in Text or Language
- Natural Language Processing (NLP):
- Vector Space Comparison: Represent the current session's content as a vector and compare it to a pre-existing corpus using cosine similarity or Euclidean distance. A low cosine similarity, or a large Euclidean distance, to every document in the corpus indicates novelty.
- Topic Modeling: Use models like Latent Dirichlet Allocation (LDA) to identify new topics or themes.
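To make the vector space comparison concrete, here is a small sketch that uses bag-of-words counts as the vectors and cosine similarity as the measure. The helper names (`bag_of_words`, `novelty_score`) and the sample corpus are hypothetical; a real system would more likely use TF-IDF weights or learned embeddings in place of raw counts.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def novelty_score(text, corpus):
    """1 minus the highest similarity to any corpus document.
    Values near 1 suggest the text is unlike anything seen before."""
    vec = bag_of_words(text)
    best = max((cosine_similarity(vec, bag_of_words(doc)) for doc in corpus),
               default=0.0)
    return 1.0 - best


# Hypothetical corpus of previously seen session content.
corpus = [
    "reset my account password",
    "update billing address on my account",
]

print(novelty_score("reset my account password", corpus))
print(novelty_score("quantum entanglement experiments", corpus))
```

The first query matches a corpus document exactly and scores close to 0; the second shares no terms with the corpus and scores 1.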
- Semantic Changes:
- Use embeddings (e.g., Word2Vec, BERT) to determine whether the semantic content deviates significantly from previous data.
3. Novelty in Behavior
- Behavioral Modeling:
- Compare current actions to established patterns using Hidden Markov Models (HMMs) or Dynamic Time Warping (DTW).
- Anomalies in patterns, timing, or sequences can indicate novelty.
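As one way to compare a current action sequence against an established pattern, the following is a minimal Dynamic Time Warping sketch for numeric sequences. The function name `dtw_distance` and the example sequences are assumptions for illustration; the absolute difference is used as the local cost.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] also matches b[j-2]
                                 cost[i][j - 1],      # b[j-1] also matches a[i-2]
                                 cost[i - 1][j - 1])  # advance both sequences
    return cost[n][m]


# Hypothetical behavior traces (e.g. activity level per time step).
typical = [1, 2, 3, 2, 1]
slower = [1, 1, 2, 2, 3, 3, 2, 2, 1, 1]   # same shape, half the speed
different = [3, 1, 3, 1, 3]

print(dtw_distance(typical, slower))     # 0.0
print(dtw_distance(typical, different))  # larger than 0
```

Because DTW stretches the time axis, the slowed-down trace is treated as the same pattern, while the trace with a different shape yields a positive distance that can be compared against a threshold.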
4. Novelty in Visual Data
- Computer Vision:
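- Use convolutional neural networks (CNNs) or autoencoders to identify patterns that differ from known visual data, for example by thresholding an autoencoder's reconstruction error or the distance to known examples in a CNN's feature space.
- Generative models like Variational Autoencoders (VAEs) or GANs (Generative Adversarial Networks) can flag inputs that they reconstruct poorly, which suggests the input lies outside the distribution they were trained on.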
5. Novelty in User Sessions
- Interaction Analysis:
- Track and compare session-level features such as time spent, clicks, queries, or navigation paths. Deviations from the norm suggest novelty.
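For navigation paths specifically, one simple way to measure deviation from the norm is the fraction of page-to-page transitions in a session that never occurred in past sessions. This is a sketch under that assumption; the function names and page names are hypothetical.

```python
def known_transitions(past_paths):
    """Collect every consecutive (from_page, to_page) pair seen in past sessions."""
    return {pair for path in past_paths for pair in zip(path, path[1:])}


def path_novelty(path, seen):
    """Fraction of transitions in `path` that are absent from `seen` (0.0 to 1.0)."""
    pairs = list(zip(path, path[1:]))
    if not pairs:
        return 0.0
    unseen = sum(1 for pair in pairs if pair not in seen)
    return unseen / len(pairs)


# Hypothetical session history.
history = [
    ["home", "search", "product", "cart"],
    ["home", "product", "cart", "checkout"],
]
seen = known_transitions(history)

print(path_novelty(["home", "search", "product", "cart"], seen))   # 0.0
print(path_novelty(["home", "product", "settings"], seen))         # 0.5
print(path_novelty(["home", "settings", "export", "cart"], seen))  # 1.0
```

Numeric features such as time spent or click counts can be handled separately with the same statistical checks used for outlier detection.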
- Context-Aware Models:
- Contextual bandits or reinforcement learning can be used to recognize when a session presents inputs or behaviors that past experience does not cover, and therefore requires a novel response.
6. Novelty in Context or Task
- Knowledge Base Comparison:
- Match session data against a known context (e.g., using a knowledge graph or rule-based system). If entities or relationships appear that the known context does not contain, the session may be novel.
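The knowledge base comparison can be sketched as a set lookup over (subject, relation, object) triples. The triple representation, the function name `find_novelty`, and the sample facts are all assumptions made for this example; a real knowledge graph would usually be queried through its own store.

```python
def find_novelty(session_triples, known_entities, known_triples):
    """Return the entities and triples in the session that the knowledge base lacks."""
    new_entities = set()
    new_triples = set()
    for subject, relation, obj in session_triples:
        for entity in (subject, obj):
            if entity not in known_entities:
                new_entities.add(entity)
        if (subject, relation, obj) not in known_triples:
            new_triples.add((subject, relation, obj))
    return new_entities, new_triples


# Hypothetical knowledge base.
known_triples = {
    ("alice", "works_at", "acme"),
    ("acme", "located_in", "berlin"),
}
known_entities = {e for s, _, o in known_triples for e in (s, o)}

# Hypothetical facts extracted from the current session.
session = [
    ("alice", "works_at", "acme"),
    ("alice", "manages", "bob"),
]

new_entities, new_triples = find_novelty(session, known_entities, known_triples)
print(new_entities)  # {'bob'}
print(new_triples)   # {('alice', 'manages', 'bob')}
```

An empty result for both sets means the session only restates known facts; anything else is a candidate for review as novel.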
- Unexpected Queries:
- Identify whether the session introduces new topics, tasks, or questions outside the system's expected scope.