Detecting novelty

From 2nd Book
Revision as of 18:06, 8 January 2025 by Pig

How to detect novelty in a session depends on the kind of data involved. The methods below are commonly used, grouped by data type:

1. Novelty in Data Streams

  • Change Detection Algorithms:
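    • CUSUM (Cumulative Sum): Accumulates deviations from a reference mean and raises an alarm when the running sum crosses a threshold, which makes it sensitive to small, sustained shifts in the mean of a signal.
    • EWMA (Exponentially Weighted Moving Average): Smooths the stream with exponentially decaying weights and flags a shift when the smoothed value leaves its control limits.
    • Page-Hinkley Test: A sequential, CUSUM-style test that tracks the cumulative deviation of observations from their running mean and signals a change when that sum drifts too far from its minimum (or maximum).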
  • Outlier Detection:
    • Use statistical methods (e.g., z-scores or IQR) to flag unusual data points.
    • Machine learning models like isolation forests or one-class SVMs can detect outliers indicative of novelty.
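The CUSUM and IQR ideas above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a tuned detector: the function names, the `slack` and `threshold` values, and the toy stream are all assumptions made for the example.

```python
from statistics import mean, quantiles


def cusum_alarms(values, target, slack, threshold):
    """Two-sided CUSUM: return indices where an upward or downward shift is signalled."""
    high = low = 0.0
    alarms = []
    for i, x in enumerate(values):
        # Accumulate only the part of each deviation that exceeds the slack.
        high = max(0.0, high + (x - target - slack))
        low = max(0.0, low + (target - x - slack))
        if high > threshold or low > threshold:
            alarms.append(i)
            high = low = 0.0  # restart both sums after an alarm
    return alarms


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return [x for x in values if x < q1 - k * spread or x > q3 + k * spread]


# Hypothetical stream: a stable phase followed by an upward shift of about 1.0.
baseline = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9]
shifted = [11.0, 11.2, 10.9, 11.1, 11.0]
stream = baseline + shifted

alarms = cusum_alarms(stream, target=mean(baseline), slack=0.25, threshold=2.0)
outliers = iqr_outliers(baseline + [14.0])
print("CUSUM alarms at indices:", alarms)
print("IQR outliers:", outliers)
```

The slack term keeps ordinary noise from accumulating, so the sums stay at zero during the stable phase and only grow once the mean has moved. In practice both `slack` and `threshold` are usually chosen relative to the noise level of the stream.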

2. Novelty in Text or Language

  • Natural Language Processing (NLP):
    • Vector Space Comparison: Represent the current session's content as a vector and compare it with a pre-existing corpus using cosine similarity or Euclidean distance. Low cosine similarity, or a large Euclidean distance, to everything in the corpus indicates novelty.
    • Topic Modeling: Use models like Latent Dirichlet Allocation (LDA) to identify new topics or themes.
  • Semantic Changes:
    • Use embeddings (e.g., Word2Vec, BERT) to determine whether the semantic content deviates significantly from previous data.
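As a rough sketch of the vector space comparison, the snippet below scores a piece of text by one minus its best cosine match against a reference corpus. It uses plain term counts as a stand-in for learned embeddings such as Word2Vec or BERT; the helper names and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector; a stand-in for a learned embedding."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def novelty_score(text, corpus):
    """1 minus the best cosine match against the reference corpus."""
    vec = bag_of_words(text)
    best = max((cosine_similarity(vec, bag_of_words(doc)) for doc in corpus), default=0.0)
    return 1.0 - best


# Hypothetical reference corpus of previously seen session content.
corpus = [
    "reset my account password",
    "update billing address on my account",
]
familiar = novelty_score("reset account password", corpus)
unfamiliar = novelty_score("export telemetry to a spreadsheet", corpus)
print(f"familiar: {familiar:.2f}, unfamiliar: {unfamiliar:.2f}")
```

Taking the maximum similarity means a session counts as familiar if it resembles any single reference document. Swapping `bag_of_words` for a sentence-embedding function would keep the rest of the logic unchanged.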

3. Novelty in Behavior

  • Behavioral Modeling:
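    • Compare current actions to established patterns using Hidden Markov Models (HMMs) or Dynamic Time Warping (DTW). A low likelihood under an HMM fitted to past behavior, or a large DTW distance to typical sequences, signals a departure from the norm.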
    • Anomalies in patterns, timing, or sequences can indicate novelty.
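To illustrate the DTW option, here is a minimal textbook implementation with an absolute-difference cost. The sequences are hypothetical numeric encodings of behavior (for instance, an activity level per step); how real actions map to numbers is an assumption left to the application.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


typical = [1, 2, 3, 4, 3, 2, 1]
slower = [1, 1, 2, 2, 3, 4, 4, 3, 2, 1]  # same shape, stretched in time
different = [4, 4, 1, 1, 4, 4, 1]        # a different pattern altogether

same_shape = dtw_distance(typical, slower)
changed_shape = dtw_distance(typical, different)
print("stretched copy:", same_shape, "different pattern:", changed_shape)
```

Because DTW allows one step to align with several, a slower repetition of a known pattern scores as close, while a genuinely different sequence does not. A threshold on the distance to the nearest known pattern then serves as the novelty test.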

4. Novelty in Visual Data

  • Computer Vision:
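    • Use convolutional neural networks (CNNs) or autoencoders trained on known visual data; inputs whose features lie far from the training data, or that an autoencoder reconstructs poorly, are candidates for novelty.
    • Generative models like Variational Autoencoders (VAEs) or GANs (Generative Adversarial Networks) can flag inputs outside their learned distribution, for example through a high reconstruction error or a low likelihood.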

5. Novelty in User Sessions

  • Interaction Analysis:
    • Track and compare session-level features such as time spent, clicks, queries, or navigation paths. Deviations from the norm suggest novelty.
  • Context-Aware Models:
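    • Contextual bandits or reinforcement learning do not detect novelty directly, but their uncertainty estimates (for example, wide confidence bounds on an unfamiliar context) can indicate when a session presents inputs or behaviors requiring a novel response.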
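The interaction-analysis idea can be sketched by comparing each session-level feature with its history via z-scores. The feature names, the sample sessions, and the limit of three standard deviations are assumptions for illustration only.

```python
from statistics import mean, stdev


def session_deviations(history, session):
    """Z-score of each feature in `session` relative to past sessions."""
    scores = {}
    for feature, value in session.items():
        past = [s[feature] for s in history]
        spread = stdev(past)
        # A feature that never varied has no spread; report 0 rather than divide by zero.
        scores[feature] = 0.0 if spread == 0 else (value - mean(past)) / spread
    return scores


def is_novel(history, session, limit=3.0):
    """Flag the session if any feature is more than `limit` standard deviations out."""
    return any(abs(z) > limit for z in session_deviations(history, session).values())


# Hypothetical per-session features.
history = [
    {"seconds": 120, "clicks": 14, "queries": 3},
    {"seconds": 135, "clicks": 16, "queries": 2},
    {"seconds": 110, "clicks": 12, "queries": 3},
    {"seconds": 128, "clicks": 15, "queries": 4},
    {"seconds": 122, "clicks": 13, "queries": 3},
]
usual = {"seconds": 125, "clicks": 14, "queries": 3}
unusual = {"seconds": 124, "clicks": 15, "queries": 19}

print("usual session novel?", is_novel(history, usual))
print("unusual session novel?", is_novel(history, unusual))
```

Per-feature z-scores ignore correlations between features, so this sketch misses sessions that are unusual only in combination. Navigation paths would need a sequence method rather than a scalar comparison.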

6. Novelty in Context or Task

  • Knowledge Base Comparison:
    • Match session data against a known context (e.g., using a knowledge graph or rule-based system). If entities or relationships appear that the knowledge base does not contain, the session may be novel.
  • Unexpected Queries:
    • Identify whether the session introduces new topics, tasks, or questions outside the system's expected scope.
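A minimal sketch of the knowledge base comparison, assuming both the knowledge base and the session have already been reduced to (subject, relation, object) triples. The triple format, the function name, and the example facts are hypothetical; extracting triples from raw session data is out of scope here.

```python
def novel_elements(known_triples, session_triples):
    """Split session triples into unseen entities and unseen relationships."""
    known_entities = {e for s, _, o in known_triples for e in (s, o)}
    session_entities = {e for s, _, o in session_triples for e in (s, o)}
    new_entities = session_entities - known_entities
    new_relations = set(session_triples) - set(known_triples)
    return new_entities, new_relations


# Hypothetical knowledge base and session content.
known = {
    ("user", "owns", "account"),
    ("account", "has", "invoice"),
}
session = {
    ("user", "owns", "account"),
    ("user", "requests", "refund"),
}

new_entities, new_relations = novel_elements(known, session)
print("new entities:", new_entities)
print("new relationships:", new_relations)
```

A session that introduces no new entities can still be novel if it links known entities in a new way, which is why the two sets are reported separately.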
