An NSFW Filter for Marginalia Search
5 hours ago
- #Neural Network
- #NSFW Filter
- #Search Engine
- The author developed an NSFW filter for Marginalia Search using a neural network with a single hidden layer, because alternatives such as fastText produced too many false positives.
- Training data was generated by having an LLM (ollama/qwen) label search results as SAFE or NSFW. This avoided manual labeling, but the samples are skewed because the results were retrieved with NSFW search terms.
- The neural network uses hand-picked features (e.g., 'cum', 'balls') together with disambiguating terms (e.g., 'laude', 'golf') to reduce false positives, though balancing overall accuracy against false positives remains challenging.
- Implementation involves forward propagation and backpropagation with gradient descent, using ReLU and sigmoid activation functions, and binary cross-entropy loss.
- The filter is currently available via the API, with UI support planned. It shows ~90% accuracy in evaluations, but in practice a larger share of flagged results are false positives, because NSFW content is rare among real search results.
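
To make the network described above concrete, here is a minimal sketch in plain Python of a single-hidden-layer classifier with ReLU hidden units, a sigmoid output, binary cross-entropy loss, and gradient descent via backpropagation. It is not the author's implementation: the class `TinyNet`, the four-term vocabulary, the hand-written toy labels, the hidden-layer size, and the learning rate are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical vocabulary: two suggestive terms and two disambiguating terms
# taken from the summary. A real filter would use a much longer list.
FEATURES = ["cum", "balls", "laude", "golf"]

# Hand-written toy labels (1 = NSFW, 0 = SAFE), for illustration only.
DATA = [("cum", 1), ("balls", 1), ("cum laude", 0), ("golf balls", 0),
        ("laude", 0), ("golf", 0), ("weather report", 0)]

def featurize(text):
    """Binary bag-of-words vector over FEATURES."""
    tokens = set(text.lower().split())
    return [1.0 if term in tokens else 0.0 for term in FEATURES]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

class TinyNet:
    """One hidden ReLU layer followed by a single sigmoid output unit."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Forward propagation: returns (hidden activations, P(NSFW))."""
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, p

    def train_step(self, x, y, lr):
        """Backpropagation and one gradient descent update on one example."""
        h, p = self.forward(x)
        d_out = p - y  # gradient of BCE with respect to the pre-sigmoid output
        for j, hj in enumerate(h):
            # ReLU passes the gradient only where the unit was active.
            d_hidden = d_out * self.w2[j] if hj > 0 else 0.0
            self.w2[j] -= lr * d_out * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d_hidden * xi
            self.b1[j] -= lr * d_hidden
        self.b2 -= lr * d_out

def mean_loss(net, data):
    """Average binary cross-entropy over a labeled data set."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for text, y in data:
        _, p = net.forward(featurize(text))
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

net = TinyNet(len(FEATURES), n_hidden=8)
loss_before = mean_loss(net, DATA)
for _ in range(500):  # epochs of per-example gradient descent
    for text, label in DATA:
        net.train_step(featurize(text), float(label), lr=0.1)
loss_after = mean_loss(net, DATA)

print(f"mean BCE before: {loss_before:.3f}, after: {loss_after:.3f}")
for text, label in DATA:
    print(f"{text!r}: label={label}, p_nsfw={net.forward(featurize(text))[1]:.3f}")
```

The disambiguating terms matter because the network can learn that 'cum' together with 'laude' should score low even though 'cum' alone scores high, which a plain keyword blocklist cannot express.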
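
The base-rate effect mentioned in the last point can be illustrated with Bayes' rule. The sketch below assumes, purely for illustration, that "~90% accuracy" means 90% sensitivity and 90% specificity, and that 1% of real search results are NSFW; neither figure comes from the source, and `flagged_precision` is a hypothetical helper name.

```python
def flagged_precision(sensitivity, specificity, base_rate):
    """Fraction of flagged results that are truly NSFW (Bayes' rule)."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)

# Balanced evaluation set (assumed 50% NSFW): most flags are correct.
balanced = flagged_precision(0.9, 0.9, 0.5)

# Realistic traffic (assumed 1% NSFW): most flags are false positives.
realistic = flagged_precision(0.9, 0.9, 0.01)

print(f"precision at 50% base rate: {balanced:.3f}")
print(f"precision at 1% base rate:  {realistic:.3f}")
```

Under these assumed numbers, the same classifier that looks about 90% reliable on a balanced evaluation set would be right about only one flag in twelve on real traffic, which is why the evaluation accuracy overstates practical performance.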