Highlights
In this assignment, you will implement a ranked querying system, run a set of sample queries, and evaluate system performance. You will need to design and implement appropriate data structures for efficient searching.
The aims of the assignment are to enhance your understanding of retrieval models and evaluation, and to give you practical experience with the implementation of search algorithms.
For this assignment, you should build on (by re-using and modifying where required) your indexing code from assignment 1. As usual, wherever code is re-used (including your own) this must be clearly identified in your submission.
The “Web Search Engines and Information Retrieval” Canvas contains further announcements and a discussion board for this assignment.
Learning Outcomes
This assessment relates to the following learning outcomes:
• CLO1: apply information retrieval principles to locate relevant information in large collections of data
• CLO3: implement features of retrieval systems for web-based and other search tasks
• CLO4: analyse the performance of retrieval systems using test collections
• CLO5: make practical recommendations about deploying information retrieval systems in different search domains, including considerations for document management and querying
Requirements:
• You must implement your programs in Java, C, C++, or Python. Your programs should be well written, using good coding style and including appropriate use of comments. Your markers will look at your source code, and coding style may form part of the assessment of this assignment.
• You must include a plain text file called “README.txt” with your submission. This file should include the name and student ID of all team members at the top. It needs to clearly explain how to compile and run your programs on (titan|saturn|jupiter).csit.rmit.edu.au. The clarity of the instructions and usability of your programs may form part of the assessment of this assignment.
• Your programs may be developed on any machine, but must compile and run on the course machines, (titan|saturn|jupiter).csit.rmit.edu.au. If your submission does not compile and run on these machines, it will not be marked.
• The submitted programs must be your own code. You should not use existing external libraries that implement advanced data structures. Where libraries (or in the case of scripting languages, built-in features beyond simple low-level data types) are used for data structures such as hash tables, they must be clearly attributed, and it is up to you to demonstrate a clear understanding of how the library is implemented in the discussion in your assignment report.
• Paths should not be hard-coded.
• Where your programs need to create auxiliary files, these should be stored in the current working directory.
• Please ensure that your submission follows the file naming rules specified in the tasks below. File names are case sensitive, i.e. if it is specified that the file name is gryphon, then that is exactly the file name you should submit; Gryphon, GRYPHON, griffin, and anything else but gryphon will be rejected.
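As an illustration of the two requirements on paths and auxiliary files, the following is a minimal Python sketch, not a prescribed interface. The argument name collection, the option --aux-name, and the helper aux_path are hypothetical names chosen for this example; the actual command-line format is whatever the tasks specify.

```python
import argparse
import os


def parse_args(argv=None):
    """Read all input locations from the command line instead of hard-coding them."""
    parser = argparse.ArgumentParser(description="Illustrative sketch only")
    # Hypothetical positional argument: where the collection lives.
    parser.add_argument("collection", help="path to the document collection")
    # Hypothetical option: base name of an auxiliary file the program creates.
    parser.add_argument("--aux-name", default="aux.tmp",
                        help="file name for auxiliary output")
    return parser.parse_args(argv)


def aux_path(name):
    """Place an auxiliary file in the current working directory.

    Any directory component in `name` is dropped, so the file always lands
    in the directory the program was started from.
    """
    return os.path.join(os.getcwd(), os.path.basename(name))
```

A program built this way would be started with the collection path as its argument, for example /home/inforet/a2/latimes, and would write any auxiliary file to aux_path(args.aux_name).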
Programming Tasks:
You will find all files needed for this assignment in the directory
/home/inforet/a2
on (titan|saturn|jupiter).csit.rmit.edu.au.
First, have a look at the file latimes. It is part of the TREC ad hoc retrieval test collection, and comprises 131,896 newspaper documents.
You will need to write programs to perform searches on this data. As a preliminary step, you should therefore be able to index the data efficiently, for example using an inverted index. This will enable you to easily access the various term occurrence statistics that you need to use for the searching tasks described below.
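To make the idea of an inverted index concrete, here is a minimal in-memory sketch in Python. It assumes documents are already available as a mapping from a document ID to raw text, and it uses a deliberately simple tokenizer; the function names tokenize, build_index, and term_stats are hypothetical and not part of the assignment. Note that it relies on Python's built-in dict, which is a hash table and would need to be attributed and explained under the requirements above.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (an assumed, simplistic scheme)."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def build_index(docs):
    """Build an inverted index: term -> {doc_id: within-document frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings = index[term]
            postings[doc_id] = postings.get(doc_id, 0) + 1
    return index


def term_stats(index, term):
    """Return (document frequency, postings) for a term; (0, {}) if the term is unseen."""
    postings = index.get(term, {})
    return len(postings), postings
```

For example, with docs = {0: "The cat sat", 1: "the cat and the hat"}, calling term_stats(build_index(docs), "the") gives a document frequency of 2 and within-document frequencies of 1 and 2 for documents 0 and 1. These are the kinds of term occurrence statistics that ranked retrieval functions typically draw on.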
However, it is your responsibility to verify the accuracy of your indexing code.
Note: If you have concerns about the functionality of your indexing code from assignment 1, you may optionally make use of a provided inverted index dump file
index dump/invlist-TermFreq.txt
and document map file
index dump/map
If you choose to make use of these files, you will need to load them into memory and retrieve the term occurrence statistics using appropriate data structures. A README.txt file explaining the format is included in the same directory.