Add LLVM IR language module #1233

Merged 25 commits on Aug 23, 2023
15b3fb1
Add llvmir language module structure.
NiklasHeneka Jul 1, 2023
b005721
Add first draft of the simple version of the LLVM IR language module.
NiklasHeneka Jul 5, 2023
cf112c8
Refactor LLVM IR module and fix several issues.
NiklasHeneka Jul 6, 2023
0459797
Add new token types to the LLVM IR language module.
NiklasHeneka Jul 9, 2023
0c6c027
Change token types and extraction of the LLVMIR module.
NiklasHeneka Jul 14, 2023
b75caa8
Fix a failing test and a mistake in the LLVMIR antlr grammar.
NiklasHeneka Jul 15, 2023
3d4b211
Create separate tokens for operations.
NiklasHeneka Jul 16, 2023
03c11f5
Add new token types to the LLVMIR language module.
NiklasHeneka Jul 19, 2023
67feef7
Add token types for constants and constant expressions.
NiklasHeneka Jul 21, 2023
5186b41
Add draft for test file.
NiklasHeneka Jul 21, 2023
6c1396c
Remove unnecessary listener methods and add tokens for switch stateme…
NiklasHeneka Jul 24, 2023
e6409fe
Add first code to the test file.
NiklasHeneka Jul 24, 2023
083c09f
Change some token types and add listener methods.
NiklasHeneka Jul 24, 2023
657e698
Rename some token types and improve test file.
NiklasHeneka Jul 27, 2023
4530eda
Improve token extraction and remove unnecessary listener methods.
NiklasHeneka Aug 1, 2023
568f6a3
Merge LLVM IR module into develop branch.
NiklasHeneka Aug 3, 2023
bbcdfdf
Update LLVM IR language module.
NiklasHeneka Aug 4, 2023
5a8baa1
Improve token abstraction.
NiklasHeneka Aug 9, 2023
1b17ea8
Fix tests and README.
NiklasHeneka Aug 9, 2023
cd5a730
Improve tests, add maven javadoc plugin and update README.
NiklasHeneka Aug 10, 2023
865ee0e
Fix format of the README table and update LLVMIR Antlr grammar.
NiklasHeneka Aug 11, 2023
0ab04b7
Fix JavaDoc issue.
NiklasHeneka Aug 15, 2023
41d6856
Improve language module test and add import statement.
NiklasHeneka Aug 17, 2023
71a6a18
Remove the snapshot version from the pom.xml file.
NiklasHeneka Aug 18, 2023
e8fbd1c
Add JavaDoc and some minor fixes.
NiklasHeneka Aug 22, 2023
34 changes: 18 additions & 16 deletions README.md
@@ -20,22 +20,23 @@ JPlag is a system that finds similarities among multiple sets of source code fil
In the following, a list of all supported languages with their supported language version is provided. A language can be selected from the command line using subcommands (jplag [jplag options] <language name> [language options]). Alternatively you can use the legacy "-l" argument.

| Language | Version | CLI Argument Name | [state](https://github.com/jplag/JPlag/wiki/2.-Supported-Languages) | parser |
-|--------------------------------------------------------|--------:|-------------------|:----------------------------------------------------------------:|:---------:|
-| [Java](https://www.java.com) | 17 | java | mature | JavaC |
-| [C/C++](https://isocpp.org) | 11 | cpp | legacy | JavaCC |
-| [C/C++](https://isocpp.org) | 14 | cpp2 | beta | ANTLR 4 |
-| [C#](https://docs.microsoft.com/en-us/dotnet/csharp/) | 6 | csharp | beta | ANTLR 4 |
-| [Go](https://go.dev) | 1.17 | golang | beta | ANTLR 4 |
-| [Kotlin](https://kotlinlang.org) | 1.3 | kotlin | beta | ANTLR 4 |
-| [Python](https://www.python.org) | 3.6 | python3 | legacy | ANTLR 4 |
-| [R](https://www.r-project.org/) | 3.5.0 | rlang | beta | ANTLR 4 |
-| [Rust](https://www.rust-lang.org/) | 1.60.0 | rust | beta | ANTLR 4 |
-| [Scala](https://www.scala-lang.org) | 2.13.8 | scala | beta | Scalameta |
-| [Scheme](http://www.scheme-reports.org) | ? | scheme | unknown | JavaCC |
-| [Swift](https://www.swift.org) | 5.4 | swift | beta | ANTLR 4 |
-| [EMF Metamodel](https://www.eclipse.org/modeling/emf/) | 2.25.0 | emf | beta | EMF |
-| [EMF Model](https://www.eclipse.org/modeling/emf/) | 2.25.0 | emf-model | alpha | EMF |
-| Text (naive) | - | text | legacy | CoreNLP |
+|--------------------------------------------------------|--------:|-------------------|:-------------------------------------------------------------------:|:---------:|
+| [Java](https://www.java.com) | 17 | java | mature | JavaC |
+| [C/C++](https://isocpp.org) | 11 | cpp | legacy | JavaCC |
+| [C/C++](https://isocpp.org) | 14 | cpp2 | beta | ANTLR 4 |
+| [C#](https://docs.microsoft.com/en-us/dotnet/csharp/) | 6 | csharp | beta | ANTLR 4 |
+| [Go](https://go.dev) | 1.17 | golang | beta | ANTLR 4 |
+| [Kotlin](https://kotlinlang.org) | 1.3 | kotlin | beta | ANTLR 4 |
+| [Python](https://www.python.org) | 3.6 | python3 | legacy | ANTLR 4 |
+| [R](https://www.r-project.org/) | 3.5.0 | rlang | beta | ANTLR 4 |
+| [Rust](https://www.rust-lang.org/) | 1.60.0 | rust | beta | ANTLR 4 |
+| [Scala](https://www.scala-lang.org) | 2.13.8 | scala | beta | Scalameta |
+| [Scheme](http://www.scheme-reports.org) | ? | scheme | unknown | JavaCC |
+| [Swift](https://www.swift.org) | 5.4 | swift | beta | ANTLR 4 |
+| [EMF Metamodel](https://www.eclipse.org/modeling/emf/) | 2.25.0 | emf | beta | EMF |
+| [EMF Model](https://www.eclipse.org/modeling/emf/) | 2.25.0 | emf-model | alpha | EMF |
+| [LLVM IR](https://llvm.org) | 15 | llvmir | beta | ANTLR 4 |
+| Text (naive) | - | text | legacy | CoreNLP |

## Download and Installation
You need Java SE 17 to run or build JPlag.
@@ -151,6 +152,7 @@ Commands:
go
java
kotlin
llvmir
python3
rlang
rust
5 changes: 5 additions & 0 deletions cli/pom.xml
@@ -102,6 +102,11 @@
<artifactId>emf-model</artifactId>
<version>${revision}</version>
</dependency>
<dependency>
<groupId>de.jplag</groupId>
<artifactId>llvmir</artifactId>
<version>${revision}</version>
</dependency>
<!-- CLI -->
<dependency>
<groupId>org.kohsuke.metainf-services</groupId>
2 changes: 1 addition & 1 deletion cli/src/test/java/de/jplag/cli/LanguageTest.java
@@ -26,7 +26,7 @@ void testInvalidLanguage() {
@Test
void testLoading() {
var languages = LanguageLoader.getAllAvailableLanguages();
-assertEquals(16, languages.size(), "Loaded Languages: " + languages.keySet());
+assertEquals(17, languages.size(), "Loaded Languages: " + languages.keySet());
}

@Test
5 changes: 5 additions & 0 deletions coverage-report/pom.xml
@@ -121,6 +121,11 @@
<artifactId>emf-model</artifactId>
<version>${revision}</version>
</dependency>
<dependency>
<groupId>de.jplag</groupId>
<artifactId>llvmir</artifactId>
<version>${revision}</version>
</dependency>
</dependencies>
<build>
<plugins>
28 changes: 28 additions & 0 deletions languages/llvmir/README.md
@@ -0,0 +1,28 @@
# JPlag LLVM IR language module

The JPlag LLVM IR module allows the use of JPlag with submissions written in LLVM IR. <br>
It is based on the [LLVMIR ANTLR4 grammar](https://github.com/antlr/grammars-v4/tree/master/llvm-ir), licensed under MIT.

### LLVM IR specification compatibility

The grammar definition targets LLVM 15, released in September 2022.

The grammar in this repository contains a fix; see the comment in the [LLVM IR grammar](src/main/antlr4/de/jplag/llvmir/grammar/LLVMIR.g4).

If the grammar is updated to a more recent<a href="#footnote-1"><sup>1</sup></a> syntax definition, this module should be updated as well.


### Token Extraction

The extracted tokens include nesting tokens for functions and basic blocks, as well as separate tokens for various other elements.
These include binary and bitwise instructions (such as `add` and `or`), memory operations (such as `load` and `store`), terminator instructions (such as branches), conversions, global variables, type definitions, constants, and others.
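
To make the idea of nesting tokens and per-category instruction tokens more concrete, here is a minimal sketch. It is not the module's implementation: the module relies on an ANTLR 4 grammar and listener, whereas this sketch is a simplified, line-based stand-in. The class name `LlvmIrTokenSketch` and all token names are hypothetical and chosen only for illustration, and only a handful of opcodes are covered.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, line-based illustration; the real module parses with an ANTLR 4 grammar. */
class LlvmIrTokenSketch {

    // Hypothetical token names; the module defines its own token types.
    enum Token {
        FUNCTION_BEGIN, FUNCTION_END, BASIC_BLOCK_BEGIN, BASIC_BLOCK_END,
        BINARY_OP, BITWISE_OP, LOAD, STORE, BRANCH, RETURN
    }

    /** Maps an opcode to a token category, or null if this sketch does not cover it. */
    static Token classify(String opcode) {
        switch (opcode) {
            case "add": case "sub": case "mul": case "udiv": case "sdiv":
            case "urem": case "srem": case "fadd": case "fsub": case "fmul":
            case "fdiv": case "frem":
                return Token.BINARY_OP;
            case "shl": case "lshr": case "ashr": case "and": case "or": case "xor":
                return Token.BITWISE_OP;
            case "load":
                return Token.LOAD;
            case "store":
                return Token.STORE;
            case "br":
                return Token.BRANCH;
            case "ret":
                return Token.RETURN;
            default:
                return null;
        }
    }

    /** Emits nesting tokens for functions and basic blocks plus one token per known instruction. */
    static List<Token> extract(String ir) {
        List<Token> tokens = new ArrayList<>();
        boolean inFunction = false;
        boolean inBlock = false;
        for (String raw : ir.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith(";")) {
                continue; // skip blank lines and comments
            }
            if (line.startsWith("define ")) {
                tokens.add(Token.FUNCTION_BEGIN);
                inFunction = true;
            } else if (inFunction && line.equals("}")) {
                if (inBlock) {
                    tokens.add(Token.BASIC_BLOCK_END);
                    inBlock = false;
                }
                tokens.add(Token.FUNCTION_END);
                inFunction = false;
            } else if (inFunction && line.endsWith(":")) {
                // A label closes the previous basic block and opens a new one.
                if (inBlock) {
                    tokens.add(Token.BASIC_BLOCK_END);
                }
                tokens.add(Token.BASIC_BLOCK_BEGIN);
                inBlock = true;
            } else if (inFunction) {
                if (!inBlock) {
                    tokens.add(Token.BASIC_BLOCK_BEGIN); // unlabeled entry block
                    inBlock = true;
                }
                // Drop an optional "%result = " prefix, then read the opcode.
                int eq = line.indexOf(" = ");
                String rest = eq >= 0 ? line.substring(eq + 3) : line;
                Token token = classify(rest.split("\\s+")[0]);
                if (token != null) {
                    tokens.add(token);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String ir = String.join("\n",
                "define i32 @f(i32 %a, i32 %b) {",
                "entry:",
                "  %sum = add i32 %a, %b",
                "  %m = and i32 %sum, 255",
                "  br label %exit",
                "exit:",
                "  ret i32 %m",
                "}");
        System.out.println(extract(ir));
    }
}
```

Because the token stream records only the category of each instruction and the nesting structure, renaming registers or labels in a submission would leave the stream in this sketch unchanged.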


### Usage

To use the LLVM IR module, add the `-l llvmir` flag in the CLI, or use a `JPlagOptions` object with `new de.jplag.llvmir.LLVMIRLanguage()` as `language` in the Java API, as described in the usage information in the [readme of the main project](https://github.com/jplag/JPlag#usage) and [in the wiki](https://github.com/jplag/JPlag/wiki/1.-How-to-Use-JPlag).

<br>

#### Footnotes
<section id="footnote-1"><sup>1 </sup>The grammar files are taken from grammars-v4, with the most recent modification in <a href="https://github.com/antlr/grammars-v4/tree/768b12e1db509aa700a316e3eed1e23e8c4bdb06/llvm-ir">commit 768b12e</a> from August 2023.</section>
39 changes: 39 additions & 0 deletions languages/llvmir/pom.xml
@@ -0,0 +1,39 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<groupId>de.jplag</groupId>
<artifactId>languages</artifactId>
<version>${revision}</version>
</parent>
<artifactId>llvmir</artifactId>

<dependencies>
<dependency>
<groupId>org.antlr</groupId>
<artifactId>antlr4-runtime</artifactId>
</dependency>
<dependency>
<groupId>de.jplag</groupId>
<artifactId>language-antlr-utils</artifactId>
<version>${revision}</version>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.antlr</groupId>
<artifactId>antlr4-maven-plugin</artifactId>
<executions>
<execution>
<goals>
<goal>antlr4</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>