Handling Spelling Mistakes with the SpellChecker API

Apache Lucene is a powerful search library that helps you search text data quickly and efficiently. Among its features is the SpellChecker API, which detects likely misspellings in what a user types and suggests corrections. This is especially useful in text search, where misspelled queries are common.

In this tutorial, we show how to handle spelling mistakes with the SpellChecker API and how it fits into a Lucene indexing and search system.

1. SpellChecker API Overview

The SpellChecker API is the part of Lucene that helps you detect and correct misspelled words. It builds its own spell index from a list of known words, searches that index internally with an IndexSearcher, and compares the word the user typed against the candidate spellings it finds. For a given word, it returns the known words that match most closely.

2. SpellChecker API Setup

To use the SpellChecker API in Lucene, you first create a SpellChecker and a Dictionary. A Dictionary is the source of correctly spelled words, for example the terms of one field in an existing index. The SpellChecker indexes those words and uses them to look up corrections.
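
To make the idea of a word source concrete, here is a minimal sketch in plain Java that does not use Lucene at all. It assumes the word list arrives as lines of text with one word per line (the same layout that Lucene's PlainTextDictionary reads). The class name WordListSketch and the method normalize are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

public class WordListSketch {

    // Hypothetical helper: trims and lowercases each line, then drops
    // blank lines and duplicates so every word appears exactly once.
    static SortedSet<String> normalize(Iterable<String> lines) {
        SortedSet<String> words = new TreeSet<>();
        for (String line : lines) {
            String word = line.trim().toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Lucene", "search", "", "lucene", " index ");
        // prints [index, lucene, search]
        System.out.println(normalize(lines));
    }
}
```

A cleaned list like this is the kind of input a spell checker needs: each correct spelling once, in a consistent case.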

Step 1: Maven Dependency (Lucene)

Add Lucene core and the Lucene module that contains SpellChecker to your pom.xml file.

<dependencies>
    <!-- Apache Lucene Dependency -->
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-core</artifactId>
        <version>8.11.0</version>
    </dependency>
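    <!-- SpellChecker lives in the suggest module in Lucene 4.0 and later -->
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-suggest</artifactId>
        <version>8.11.0</version>
    </dependency>
</dependencies>

Here, the lucene-core and lucene-suggest dependencies are added. The SpellChecker classes ship in lucene-suggest; the separate lucene-spellchecker artifact was published only for Lucene 3.x and earlier, so it cannot be resolved at version 8.11.0.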

3. SpellChecker Setup Example

To use the SpellChecker API, you first create a SpellChecker and then prepare a Dictionary. A Directory is the location where Lucene stores an index; the spell index that SpellChecker builds is also stored in a Directory.

Step 2: Creating the SpellChecker and Dictionary

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.spell.LuceneDictionary;
import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import java.nio.file.Paths;

public class SpellCheckerExample {

    public static void main(String[] args) throws Exception {

        // Analyzer used for the main index and for the spell index
        Analyzer analyzer = new StandardAnalyzer();

        // One Directory for the main index and a separate one for the spell index,
        // so the n-gram documents SpellChecker writes do not mix with our own documents
        try (Directory indexDir = FSDirectory.open(Paths.get("index"));
             Directory spellDir = FSDirectory.open(Paths.get("spell-index"))) {

            // Index some words; they become the source of correct spellings
            try (IndexWriter writer = new IndexWriter(indexDir, new IndexWriterConfig(analyzer))) {
                addDocument(writer, "hello");
                addDocument(writer, "world");
                addDocument(writer, "lucene");
                addDocument(writer, "search");
            }

            try (IndexReader reader = DirectoryReader.open(indexDir);
                 SpellChecker spellChecker = new SpellChecker(spellDir)) {

                // Build the spell index from the terms of the "content" field.
                // LuceneDictionary reads from an IndexReader, not from a Directory.
                spellChecker.indexDictionary(
                        new LuceneDictionary(reader, "content"),
                        new IndexWriterConfig(analyzer),
                        false); // false: skip the full merge of the spell index

                // Ask for at most 5 suggestions for the misspelled word
                String[] suggestions = spellChecker.suggestSimilar("luene", 5);
                System.out.println("Suggestions for 'luene':");
                for (String suggestion : suggestions) {
                    System.out.println(suggestion);
                }
            }
        }
    }

    // Adds a single word as a one-field document
    private static void addDocument(IndexWriter writer, String word) throws Exception {
        Document doc = new Document();
        doc.add(new TextField("content", word, Field.Store.YES));
        writer.addDocument(doc);
    }
}

Code Breakdown:

IndexWriter: We use an IndexWriter to index a few sample words: "hello", "world", "lucene", and "search".
SpellChecker: Next, SpellChecker builds its spell index from the dictionary, which here is the set of terms in the content field of the index.
suggestSimilar(): The suggestSimilar method returns likely correct spellings for luene. The second argument, 5, is the maximum number of suggestions to return, so fewer may come back.

4. SpellChecker Example Execution

When you run the code above and check the word "luene", SpellChecker suggests lucene, because it is the closest spelling among the indexed words.

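The output should look like this:

Suggestions for 'luene':
lucene

Only one suggestion is printed even though up to 5 were requested. The dictionary contains just four words (hello, world, lucene, search), and lucene is the only one close enough to luene. Words that were never indexed cannot appear as suggestions.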
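
One way to see why lucene is found for luene is to compare the short character sequences (n-grams) the two words share, since SpellChecker looks up candidates through an n-gram index. The plain-Java sketch below is a simplification that uses only 3-grams and no Lucene classes; the real implementation varies the n-gram size with word length and also weights the start and end of each word. NGramSketch, ngrams, and shared are hypothetical names used only here.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class NGramSketch {

    // All substrings of length n, in order of first appearance
    static Set<String> ngrams(String word, int n) {
        Set<String> grams = new LinkedHashSet<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // The n-grams that both words contain
    static Set<String> shared(String a, String b, int n) {
        Set<String> common = ngrams(a, n);
        common.retainAll(ngrams(b, n));
        return common;
    }

    public static void main(String[] args) {
        for (String word : Arrays.asList("hello", "world", "lucene", "search")) {
            System.out.println(word + " -> " + shared("luene", word, 3));
        }
        // hello -> []
        // world -> []
        // lucene -> [ene]
        // search -> []
    }
}
```

Only lucene shares a 3-gram with luene, so it is the only indexed word that becomes a candidate in this simplified view.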

5. Advanced Features of SpellChecker API

When using the SpellChecker API, you can take advantage of a few more advanced features:

Distance Calculation: SpellChecker scores how similar two words are with a string distance measure such as Levenshtein distance. You can plug in a different measure with setStringDistance().
Threshold: You can set a minimum similarity with setAccuracy(), so that only sufficiently close words are returned. The default is 0.5.
Dictionary Customization: You can build your own custom dictionary and index it into SpellChecker, for example the vocabulary of a particular domain or field. PlainTextDictionary, for instance, reads a text file with one word per line.
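
The distance and threshold ideas above can be sketched in plain Java without Lucene. The code computes the Levenshtein distance, turns it into a similarity between 0 and 1 by dividing by the length of the longer word, and keeps only candidates at or above a minimum similarity. This is an illustration under stated assumptions, not Lucene's own code; SimilaritySketch, similarity, suggest, and the sample word list are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SimilaritySketch {

    // Levenshtein distance: minimum number of single-character
    // insertions, deletions, and substitutions to turn a into b
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1]: 1.0 means identical words
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / longest;
    }

    // Candidates at or above the threshold, most similar first
    static List<String> suggest(String word, List<String> dictionary, double minSimilarity) {
        List<String> result = new ArrayList<>();
        for (String candidate : dictionary) {
            if (!candidate.equals(word) && similarity(word, candidate) >= minSimilarity) {
                result.add(candidate);
            }
        }
        result.sort((x, y) -> Double.compare(similarity(word, y), similarity(word, x)));
        return result;
    }

    public static void main(String[] args) {
        // A small domain-specific word list (made up for this example)
        List<String> dictionary = Arrays.asList("lucene", "luke", "solr", "search");
        System.out.println(suggest("luene", dictionary, 0.5)); // prints [lucene, luke]
        System.out.println(suggest("luene", dictionary, 0.7)); // prints [lucene]
    }
}
```

Raising the threshold from 0.5 to 0.7 drops luke (similarity 0.6) and keeps lucene (similarity about 0.83), which is the trade-off a stricter accuracy setting makes: fewer suggestions, each one closer to the input.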

Summary

Handling spelling mistakes with the SpellChecker API is an effective way to correct misspelled user input in Lucene. It builds on Lucene's indexing and search machinery to detect typos and suggest the correct words. It can be very helpful for adding spelling correction to a search interface, especially where users often search with misspelled terms.

Content added By

Md Zahid Hasan
