Testing Fuzzy Search - Quick Reference
Prerequisites
Make sure the migration has been run:
bundle exec rails db:migrate
Verify the extension is enabled:
bundle exec rails runner "puts ActiveRecord::Base.connection.extension_enabled?('pg_trgm')"
# Should output: true
Quick Tests in Rails Console
# Start console
bundle exec rails console
# 1. Test exact match (should work with or without trigram)
Listings::Job.full_text_search("cashier").count
# 2. Test with typo (this is where trigram shines)
Listings::Job.full_text_search("casier").count
# 3. Compare results
exact = Listings::Job.full_text_search("cashier")
typo = Listings::Job.full_text_search("casier")
puts "Exact match results: #{exact.count}"
puts "Typo match results: #{typo.count}"
# Should be similar or identical
# 4. Test similarity threshold
# Calculate how similar two words are
ActiveRecord::Base.connection.execute(
"SELECT word_similarity('cashier', 'full time casier position') as score"
).first
# => {"score"=>"0.444444"} (above our 0.2 threshold, so it matches)
# 5. Test various typos
test_queries = [
  "cashier",  # Exact
  "casier",   # Missing 'h'
  "cashir",   # Missing 'e'
  "cahsier",  # Transposed letters
  "cshier",   # Missing 'a'
  "casheer",  # Different vowel
  "keshier"   # Different first letter (might not match)
]
test_queries.each do |query|
  count = Listings::Job.full_text_search(query).count
  puts "#{query.ljust(15)} => #{count} results"
end
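The loop above prints raw counts; it can help to compare each typo against the exact query in one pass. The following is a minimal sketch, not part of the project: `typo_report` and the stub counter are hypothetical names, and the counts are made-up numbers so the example runs without a database.

```ruby
# Runs each query through a caller-supplied counter and returns one row per
# query. `counter` is any callable that takes a query string and returns an
# Integer. In the Rails console that could be (assumption):
#   ->(q) { Listings::Job.full_text_search(q).count }
def typo_report(baseline, variants, counter)
  baseline_count = counter.call(baseline)
  variants.map do |query|
    count = counter.call(query)
    {
      query: query,
      count: count,
      # Variant count relative to the exact-query count (0.0 if the baseline is empty).
      ratio: baseline_count.zero? ? 0.0 : (count.to_f / baseline_count).round(2)
    }
  end
end

# Stand-in counter with made-up numbers (hypothetical data, not real results).
fake_counts = { "cashier" => 40, "casier" => 38, "keshier" => 0 }
stub_counter = ->(q) { fake_counts.fetch(q, 0) }

report = typo_report("cashier", ["casier", "keshier"], stub_counter)
report.each do |row|
  puts format("%-15s => %3d results (%.2f of baseline)", row[:query], row[:count], row[:ratio])
end
```

A ratio near 1.0 suggests the typo recovers roughly as many rows as the exact query; a ratio of 0.0 flags a variant that the fuzzy match misses entirely.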
Testing Different Thresholds
If you want to experiment with different threshold values:
# Create a test scope with different threshold
class Listings::Job
  pg_search_scope :strict_search,
    against: [:company_name, :category_name, :title, :description, :description_summary],
    using: {
      tsearch: {
        dictionary: 'english',
        tsvector_column: :search_vector,
        prefix: true
      },
      trigram: {
        threshold: 0.4, # More strict
        word_similarity: true
      }
    }
end
# Compare
Listings::Job.full_text_search("casier").count # Default (0.2)
Listings::Job.strict_search("casier").count # Strict (0.4)
Performance Testing
require 'benchmark'
# Test performance (Benchmark.ms is provided by ActiveSupport and returns milliseconds)
Benchmark.ms do
  Listings::Job.full_text_search("cashier").to_a
end
# Should be < 100ms for most queries
# Compare with LIKE query (don't use in production!)
Benchmark.ms do
  Listings::Job.where("title LIKE ?", "%cashier%").to_a
end
# Usually slower than full_text_search
# Check query plan
Listings::Job.full_text_search("cashier").explain
# Should show index usage
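A single timing can be skewed by cold caches, so it may be worth repeating the measurement. The sketch below uses only the standard library; `median_ms` is a hypothetical helper, and the summed range is a stand-in workload so the example runs outside Rails.

```ruby
require 'benchmark'

# Runs the block once as an unmeasured warm-up, then `runs` more times, and
# returns the median wall-clock time in milliseconds.
def median_ms(runs: 5)
  yield # warm-up call, not measured
  timings = Array.new(runs) { Benchmark.realtime { yield } * 1000.0 }
  sorted = timings.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Stand-in workload. In the console, the block body could instead be
# (assumption): Listings::Job.full_text_search("cashier").to_a
elapsed = median_ms(runs: 5) { (1..10_000).sum }
puts "median: #{elapsed.round(3)} ms"
```

The median is used rather than the mean so that one slow outlier does not dominate the result.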
Real-World Test Scenarios
# Scenario 1: User searches for common job titles with typos
[
  { query: "waiter", typo: "waitr" },
  { query: "waiter", typo: "watier" },
  { query: "waiter", typo: "waiiter" },
  { query: "barista", typo: "barsta" },
  { query: "barista", typo: "barrista" }
].each do |test|
  exact_count = Listings::Job.full_text_search(test[:query]).count
  typo_count = Listings::Job.full_text_search(test[:typo]).count
  puts "#{test[:query]} -> #{test[:typo]}: #{exact_count} vs #{typo_count} results"
end
# Scenario 2: Multi-word searches
Listings::Job.full_text_search("full time cashier").count
Listings::Job.full_text_search("ful time casier").count # With typos
# Scenario 3: Company name search with typos
Listings::Job.full_text_search("mcdonalds").count
Listings::Job.full_text_search("macdonalds").count # Common typo
# Scenario 4: Category search
Listings::Job.full_text_search("food service").count
Listings::Job.full_text_search("fod servce").count # With typos
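Equal counts do not guarantee that the exact and typo queries return the same rows. One way to check is to compare the returned IDs. This is a minimal sketch with a hypothetical helper name (`id_overlap`) and made-up ID lists; in the console the lists could come from something like `Listings::Job.full_text_search("cashier").pluck(:id)`.

```ruby
require 'set'

# Jaccard overlap between two ID lists: |A ∩ B| / |A ∪ B|.
# Returns 1.0 when both lists are empty (nothing to disagree about).
def id_overlap(ids_a, ids_b)
  a = ids_a.to_set
  b = ids_b.to_set
  union = a | b
  return 1.0 if union.empty?

  (a & b).size.to_f / union.size
end

# Made-up IDs for illustration only.
exact_ids = [1, 2, 3, 4]
typo_ids  = [2, 3, 4, 5]
puts id_overlap(exact_ids, typo_ids) # 3 shared IDs out of 5 distinct => 0.6
```

An overlap close to 1.0 means the typo query finds essentially the same listings; a low overlap with similar counts means the fuzzy match is pulling in different rows.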
Debugging Tips
Check if extension is working
# Test the similarity function directly
ActiveRecord::Base.connection.execute(
"SELECT similarity('cashier', 'casier')"
).first
# Should return a hash with 'similarity' key
# If you get an error, the extension isn't enabled
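To build intuition for the score, the following pure-Ruby sketch approximates how pg_trgm's `similarity` is computed: each word is lowercased, padded with two leading spaces and one trailing space, and split into three-character windows; the score is the number of shared trigrams divided by the number of distinct trigrams overall. The helper names are hypothetical, and this covers `similarity` only, not `word_similarity`, which scores the query against the best-matching part of the text instead.

```ruby
require 'set'

# Approximation of pg_trgm trigram extraction: lowercase, split on
# non-alphanumeric characters, pad each word, take every 3-character window.
def trigrams(text)
  text.downcase.scan(/[[:alnum:]]+/).each_with_object(Set.new) do |word, set|
    padded = "  #{word} "
    (0..padded.length - 3).each { |i| set << padded[i, 3] }
  end
end

# Shared trigrams divided by distinct trigrams across both strings.
def trigram_similarity(a, b)
  ta = trigrams(a)
  tb = trigrams(b)
  union = ta | tb
  return 0.0 if union.empty?

  (ta & tb).size.to_f / union.size
end

# "cashier" has 8 trigrams, "casier" has 7, and they share 5, so 5 / 10.
puts trigram_similarity("cashier", "casier") # => 0.5
```

Comparing this value with the database result is a quick sanity check; small differences are possible for inputs where the real extension's tokenization differs from this approximation.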
Check which columns are being searched
# Inspect the SQL that the scope generates
puts Listings::Job.full_text_search("cashier").to_sql
# Should contain both a tsearch condition (@@) and a trigram condition (word_similarity)
Check if results are ranked correctly
# Get results with their ranking scores
# (pg_search_rank is only available after calling with_pg_search_rank)
results = Listings::Job.full_text_search("cashier").with_pg_search_rank
results.each do |job|
  puts "#{job.title} - Rank: #{job.pg_search_rank}"
end
# Higher scores = better matches
# Exact matches should have higher scores than fuzzy matches
Verify search_vector is populated
# Check if the tsvector column has data
job = Listings::Job.first
job.search_vector
# Should show something like: "'cashier':4 'experienc':3 'look':1"
# If nil, the search_vector needs to be populated
# (This should be handled by your data sync process)
Common Issues
Issue: Still getting no results with typos
Check:
# 1. Is the extension enabled?
ActiveRecord::Base.connection.extension_enabled?('pg_trgm')
# 2. Is the threshold too high?
# Try a lower threshold temporarily to test
# 3. What's the similarity score?
ActiveRecord::Base.connection.execute(
"SELECT word_similarity('casier', 'cashier') as score"
).first
# If score < 0.2, the trigram condition won't match at the configured 0.2 threshold
# 4. Are you searching the right columns?
# Check if the text exists in the 'against' columns
Listings::Job.where("title ILIKE ?", "%cashier%").count
Issue: Too many irrelevant results
Solution: Increase the threshold in the model:
trigram: {
  threshold: 0.3, # Increase from 0.2 to 0.3
  word_similarity: true
}
Issue: Queries are slow
Check:
# 1. Are indexes being used?
Listings::Job.full_text_search("cashier").explain
# Should show "Index Scan" or "Bitmap Index Scan", not "Seq Scan"
# 2. How many rows are we searching?
Listings::Job.count
# If > 100k rows, consider adding trigram indexes
# 3. How many columns are in 'against'?
# Fewer columns = faster searches
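Reading a long query plan by eye is error-prone, so a small text check can help. The sketch below is a hypothetical helper (`scan_types`, `sequential_scan?`) that looks for scan node names in the plan text; the sample plan, table name, and index name are made up for illustration. In the console the input could be the output of `Listings::Job.full_text_search("cashier").explain`, converted with `to_s`.

```ruby
# Scan node names to look for. Longer names come first so that
# "Bitmap Index Scan" is not reported as a plain "Index Scan".
SCAN_PATTERN = /\b(Seq Scan|Bitmap Index Scan|Bitmap Heap Scan|Index Only Scan|Index Scan)\b/

# Returns the distinct scan node types found in the plan, in order of appearance.
def scan_types(plan_text)
  plan_text.scan(SCAN_PATTERN).flatten.uniq
end

# True when the plan contains at least one sequential scan.
def sequential_scan?(plan_text)
  scan_types(plan_text).include?("Seq Scan")
end

# Made-up plan text (hypothetical table and index names).
sample_plan = <<~PLAN
  Bitmap Heap Scan on jobs  (cost=12.00..16.01 rows=1 width=64)
    ->  Bitmap Index Scan on index_jobs_on_search_vector  (cost=0.00..12.00 rows=1 width=0)
PLAN

puts scan_types(sample_plan).inspect
puts "sequential scan present: #{sequential_scan?(sample_plan)}"
```

A sequential scan is not always a problem on small tables, where the planner may prefer it; the check is most useful on production-sized data.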
Adjusting Configuration
If you need to adjust the configuration after testing:
- Edit app/domains/listings/job.rb (in the jodapp-api repo)
- Change the threshold value
- Restart the Rails console
- Re-test
# Before
trigram: {
  threshold: 0.2,
  word_similarity: true
}
# After (example: more strict)
trigram: {
  threshold: 0.3,
  word_similarity: true
}
Success Criteria
✅ Your implementation is working if:
- Exact matches return results
- Typos (1-2 character differences) return similar results
- Results are ranked (exact matches higher than typos)
- Query time is < 100ms for most searches
- No false positives (completely unrelated results)
Next Steps After Testing
- Monitor in production: Track which search terms users actually use
- Analyze results: Are users getting relevant results?
- Tune threshold: Adjust based on user feedback
- Add analytics: Track search-to-click conversion
- Consider indexes: Add trigram indexes if queries are slow
Quick Reference
# Basic search
Listings::Job.full_text_search("query")
# With additional filters
Listings::Job.full_text_search("cashier").where(status: :open)
# With pagination
Listings::Job.full_text_search("cashier").page(1).per(20)
# Get ranked results
Listings::Job.full_text_search("cashier").with_pg_search_rank
# Order by rank explicitly (pg_search scopes already sort by rank, best match first)
Listings::Job.full_text_search("cashier")
  .with_pg_search_rank
  .reorder("pg_search_rank DESC")