Turkish Journal of Electrical Engineering and Computer Sciences

Privacy preserving scheme for document similarity detection

Abstract

Detecting similar documents plays an essential role in many real-world applications, such as copyright protection and plagiarism detection. The problem becomes more challenging when data privacy must be protected: the documents to be matched are distributed among two or more parties, and each party's privacy should be preserved. In this paper, we propose new privacy-preserving document similarity detection schemes based on locality-sensitive hashing, which can tolerate misspellings. Furthermore, the keyword occurrences of a given document are integrated into its underlying representation to support better ranking of the returned results. We introduce a new security definition that hides the exact similarity scores from the querying party. Extensive experiments on real-world data show that the proposed schemes are efficient and accurate.
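The abstract does not specify which locality-sensitive hash family the schemes use, so the following is only an illustrative sketch of the general idea, not the authors' construction. It assumes MinHash over character bigrams, and it omits the privacy-preserving (multiparty) layer and the keyword-occurrence weighting entirely. All function names (`char_ngrams`, `minhash_signature`, `estimated_similarity`, `jaccard`) are hypothetical. The point it shows is why a misspelled word can still match: a small typo changes only a few character n-grams, so the two n-gram sets keep a high overlap.

```python
import hashlib


def char_ngrams(text, n=2):
    """Return the set of character n-grams of every word in `text`.

    Each word is padded with '^' and '$' so that prefixes and suffixes
    produce their own n-grams.
    """
    grams = set()
    for word in text.lower().split():
        padded = f"^{word}$"
        grams.update(padded[i:i + n] for i in range(len(padded) - n + 1))
    return grams


def _seeded_hash(seed, gram):
    """Deterministic 64-bit hash of `gram` under hash function number `seed`."""
    data = f"{seed}:{gram}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def minhash_signature(grams, num_hashes=64):
    """MinHash signature: the minimum hash value under each seeded function.

    `grams` must be non-empty, otherwise min() raises ValueError.
    """
    return [min(_seeded_hash(seed, g) for g in grams)
            for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def jaccard(a, b):
    """Exact Jaccard similarity of two sets."""
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    original = char_ngrams("privacy")
    misspelled = char_ngrams("privcay")  # transposition typo
    print("exact Jaccard:", jaccard(original, misspelled))
    print("MinHash estimate:", estimated_similarity(
        minhash_signature(original), minhash_signature(misspelled)))
```

In this sketch, the two spellings share 5 of their 11 distinct bigrams, so their exact Jaccard similarity is 5/11, while an exact keyword comparison would treat them as unrelated. The MinHash estimate approximates that value and becomes tighter as `num_hashes` grows.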

DOI

10.55730/1300-0632.3801

Keywords

Document similarity, locality-sensitive hashing, multiparty computing, privacy preserving

First Page

609

Last Page

628

Recommended Citation

ABDULSADA, A, AL-DARRAJI, S, & HONI, D (2022). Privacy preserving scheme for document similarity detection. Turkish Journal of Electrical Engineering and Computer Sciences 30 (3): 609-628. https://doi.org/10.55730/1300-0632.3801

Included in

Computer Engineering Commons, Computer Sciences Commons, Electrical and Computer Engineering Commons
