Artifact Evaluation

The purpose of an artifact is to support the claims made in the paper and to allow other researchers to check the data sets, obtain the results, check their validity, and potentially also reuse the benchmarks and the implemented tools in their own research. In the case of tool papers, the artifact additionally serves to provide a working version of your tool that is easy to obtain and use.

Reviewer Instructions

All AE reviewers, please pay attention to the Reviewer Instructions which can be found here.

Author Instructions

ATVA 2024 will include two separate rounds of artifact evaluation:

  • For tool papers, the artifact evaluation is mandatory and the artifact must be submitted shortly after the submission of the paper (by May 2). The submitted artifact must obtain at least the functional badge; if it does not, the paper will be rejected. The reviews of the artifact will be available to the reviewers of the paper and can affect its acceptance.
  • For research papers, the artifact evaluation is voluntary and the authors will be invited to submit an artifact only after their paper is accepted. If the authors decide to submit an artifact, they must submit it by June 25. The reviews of the artifact and the obtained badges will not influence the acceptance of the paper.

Important Dates

For Tool Papers

  • Artifact submission: May 9
  • Smoke test reviews: May 17
  • Submission of revised artifacts: May 21 (only for issues identified during the smoke test)
  • Author notification: Jun 26

For Regular Papers

  • Artifact submission: July 2
  • Smoke test reviews: July 15
  • Submission of revised artifacts: July 18 (only for issues identified during the smoke test)
  • Author notification: August 17

Evaluation Criteria

The reviewers will read the paper and evaluate the submitted artifact. The evaluation criteria are based on the ACM guidelines. According to the guidelines, each artifact can be awarded up to three badges:

  • functional — The artifact can be executed without major problems, and its content is relevant to the paper. Ideally, it supports all claims made in the paper and can be used to reproduce all of the experimental results. Moreover, the artifact contains instructions for setting the artifact up, running the experiments, and processing their results, which other researchers can follow.
  • available — The artifact has been uploaded to a publicly available archival repository (e.g., Zenodo, figshare), is available under a DOI, and all its dependencies are either included in the artifact or also available (e.g., the dependencies are not downloaded from GitHub during setup). We strongly encourage aiming for the available badge unless you have a strong reason for not doing so, as it supports reproducible research.
  • reusable — The artifact is functional and its quality significantly exceeds minimal functionality. In particular, the tools are documented and can be used outside of the artifact. The users should be able to run the tools on other inputs and reuse them for their own applications and research.

Evaluation Process

An evaluation of a single artifact consists of two phases: a smoke test phase and a full-review phase.

In the smoke test phase, the reviewers will run a quick check of your artifact. During the smoke test, the reviewers only check whether the artifact works on a technical level (i.e., the archive contains all dependencies, and the scripts can be executed without errors and generate the expected output). The reviewers will not check for logical errors in the implemented tools/scripts, nor whether the artifact is consistent with the paper. Your artifact should contain instructions that describe how to run the smoke test and what the expected outputs are. If any technical issues are identified during the smoke test, you will have a chance to submit a fixed version of the artifact.

In the full-review phase, the reviewers will evaluate the artifact in its entirety and check whether the obtained results are consistent with the claims of the paper.

Submission

Artifacts are submitted to the ATVA 2024 Artifact Evaluation track via EasyChair:

https://easychair.org/conferences?conf=atva2024

The submission consists of

  • An abstract containing:
    • A short description of the artifact and its relationship to the paper
    • Any special requirements for the evaluation (large number of CPU cores, specific GPU, etc.), if applicable, and
    • Text that explains why you believe that your artifact satisfies the criteria of the reusable badge, if you are aiming for it.
  • A publicly available URL from which the artifact can be downloaded (a single ZIP file, see below)
  • SHA256 checksum of the ZIP file
  • A PDF of your submitted paper

For the URL, we recommend using an archival repository such as Zenodo. Firstly, this will ensure that the artifact will be available when the reviewers want to download it. Secondly, it will automatically grant you the available badge. If you instead use a custom service (e.g., a website of your institution), you are responsible for its availability throughout the evaluation period.

The SHA256 checksum can be obtained with one of the following commands:

  • Linux: sha256sum <file>
  • Windows: CertUtil -hashfile <file> SHA256
  • macOS: shasum -a 256 <file>
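
For illustration, on Linux the checksum of an artifact archive named, say, artifact.zip (a placeholder name, not a required convention) would be computed and reported as follows; the hexadecimal string is what you paste into the submission form:

    $ sha256sum artifact.zip
    <64 hexadecimal digits>  artifact.zip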

Artifact Guidelines

The submitted artifact must be packaged as a single ZIP file that contains:

  • A LICENSE file that allows the reviewers to evaluate the artifact. In particular, it must allow the reviewers to use, execute, and modify the contents of the artifact.
  • A README file that describes the contents of the artifact. If the artifact contains executable code, the README should contain instructions on how to install the artifact and use it (see below).
  • The actual content of the artifact.

If your artifact contains executable files, you cannot expect the reviewers and other researchers to run it directly on their own machines. Therefore, we require that the submitted artifact either works in the provided virtual machine based on Ubuntu 22.04 (hereafter the VM), or contains a Docker image in which the artifact can be executed, together with a Dockerfile that can be used to build the image.

If your artifact is based on Docker, submit both the Dockerfile and the generated Docker image to achieve better reusability. If it is based on the VM, do not submit a modified VM image; instead, submit the files that are necessary for setting up the provided VM. Namely, the artifact should contain 1) the files that the users copy to the VM and 2) instructions that describe how to install all the dependencies into the VM and how to set it up. This ensures that all dependencies and setup steps are well documented in the artifact.
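
As a rough sketch only (the image name atva24-artifact and the file names are placeholders, not a required convention), a Docker image could be built from the Dockerfile, exported into the artifact, and later re-imported by the reviewers as follows:

    # build the image from the Dockerfile in the current directory
    docker build -t atva24-artifact .
    # export the built image so it can be shipped inside the ZIP file
    docker save -o atva24-artifact.tar atva24-artifact
    # the reviewers can then import and run the image without rebuilding it
    docker load -i atva24-artifact.tar
    docker run --rm -it atva24-artifact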

Example Template for Docker-based Artifact

We encourage authors to submit Docker-based artifacts. An example of a Docker-based artifact can be found here.

Virtual Machine

If a Docker-based artifact is not possible, we also provide a VM image, based on Ubuntu 22.04 LTS, that can be downloaded from Zenodo. This VM will be used by the reviewers as the evaluation environment. Please make sure that your submitted artifact is self-contained and includes all dependencies required to run within the VM. Ideally, the artifact can be exercised in the VM without an internet connection.

README file

The README file should contain the following information:

  • Description of the content of the artifact. What is the purpose of the artifact? Which claims of the paper does it replicate? What is the structure of the artifact?
  • Installation instructions. What should be copied to the VM? What commands should be executed in the VM to install all the dependencies? What commands should be executed to build the Docker image and run the Docker container?
  • Instructions for the smoke test. What commands should the reviewers run? What are the expected outputs? What is the expected time needed to obtain the results?
  • Instructions for the full evaluation. What commands should be executed to replicate the results of the paper? What are the expected results? Which parts of the output correspond to which parts of the paper? What is the expected time needed to obtain the results?
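
As a rough illustration only (the section names and ordering below are a suggestion, not a prescribed format), a README covering these points might be structured as follows:

    1. Overview         - purpose of the artifact and the claims of the paper it supports
    2. Contents         - structure of the archive and what each directory contains
    3. Setup            - how to build/load the Docker image or set up the VM
    4. Smoke test       - commands to run, expected output, expected runtime
    5. Full evaluation  - commands to replicate each table/figure, expected results and runtime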

Some additional guidelines that will help you make the artifact more user-friendly and reusable for future researchers:

  • Do not expect any technical expertise from the users.
  • Keep the usage simple through easy-to-use scripts.
  • Make the output of the scripts easy to interpret. If you can, format the output as a table or generate a plot (ideally in a similar format as in the paper).
  • Do not assume a particular path in the VM to which the artifact will be copied. If you do assume specific paths, state them in the README.
  • Do not assume an internet connection unless you have to. If you have any external dependencies (e.g., Ubuntu packages), include them in the artifact. This can be achieved by including the file package.deb in the artifact and running sudo dpkg -i package.deb during the installation. If there are objective reasons why an internet connection is preferred or necessary (e.g., the benchmark set is too large, or some dependencies cannot be included in the artifact due to their license restrictions), explain this in the README.
  • State resource requirements and/or the environment in which you successfully tested the artifact.
  • If the execution of your artifact takes a significant amount of time (more than four hours), provide a smaller version of the experiments that can be executed in one hour. You can either use only a representative subset of the benchmarks and/or use shorter timeouts. In both cases, also provide the log files from the complete evaluation used in the paper, and include scripts to run the complete set of experiments for reviewers who have a sufficient amount of time (see the sketch after this list).
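
For example, the entry points could be organized as a few top-level scripts; the names and time estimates below are purely hypothetical and only illustrate the idea:

    ./run_smoke.sh    # quick technical check, a few minutes
    ./run_subset.sh   # representative subset of the benchmarks, about one hour
    ./run_all.sh      # complete set of experiments as reported in the paper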

Artifact Evaluation Co-Chairs

If you have any questions, or if the situation of your artifact is nonstandard (e.g., it requires special hardware or proprietary software), feel free to contact the Artifact Evaluation Committee chairs: