Abstract:
Duplicate Question Detection (DQD) in Programming Community Question & Answer (PCQA) platforms has been a prominent area of research in recent years. Many studies use Semantic Text Similarity (STS) as the key mechanism for this task. However, STS-based approaches have one major drawback: they cannot retrieve results quickly while using computational resources efficiently. This drawback arises because a given query question must be compared iteratively with every question in the data source. This paper presents a novel approach named StackO-DQD that combines STS with hashing to overcome this drawback. In benchmarking, the results show average increases of 1.73%, 6.52%, and 7.22% over previous work in recommending the correct similar question within the top 5, top 10, and top 20 results, respectively.