<!--
Copyright © 2022 Rot127 <unisono@quyllur.org>
SPDX-License-Identifier: BSD-3
-->

# Architecture updater - Auto-Sync

`auto-sync` is the architecture update tool for Capstone.
Because the architecture modules of Capstone use mostly code from LLVM,
we need to update this part with every LLVM release. `auto-sync` helps
with this synchronization between LLVM and Capstone's modules by
automating most of it.

Please refer to [intro.md](intro.md) for an introduction to this tool.

## Install

#### Set up Python environment and Tree-sitter

```
cd <root-dir-Capstone>
# Python version must be at least 3.11
sudo apt install python3-venv
# Set up virtual environment in Capstone root dir
python3 -m venv ./.venv
source ./.venv/bin/activate
```

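The version requirement stated above can be checked from inside the activated virtual environment. The following is a minimal sketch, not part of Auto-Sync itself; the helper name `python_is_supported` is hypothetical.

```python
import sys

# Minimum interpreter version stated in the setup instructions above.
MIN_VERSION = (3, 11)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version meets the documented minimum."""
    return tuple(version_info[:2]) >= MIN_VERSION


# Report the result for the interpreter that runs this snippet.
if python_is_supported():
    print("Python version is recent enough for Auto-Sync.")
else:
    print("Python is older than 3.11; please use a newer interpreter.")
```

Running it with the interpreter in `./.venv/bin/` shows whether the virtual environment was created with a suitable Python.
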
#### Install Auto-Sync framework

```
cd suite/auto-sync/
pip install -e .
```

#### Clone Capstone's LLVM fork and build `llvm-tblgen`

```bash
git clone https://github.com/capstone-engine/llvm-capstone vendor/llvm_root/
# The clone above lands in vendor/llvm_root/, so enter that directory
cd vendor/llvm_root/
git checkout auto-sync
mkdir build
cd build
# You can also build the "Release" version
cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug ../llvm
cmake --build . --target llvm-tblgen --config Debug
# Return to the directory the clone was started from
cd ../../../
```

#### Install `llvm-mc` and `FileCheck`

Additionally, we need `llvm-mc` and `FileCheck` to generate our regression tests.
You can build them yourself, but the build will take a lot of space on your hard drive.
You can also get the binaries [here](https://releases.llvm.org/download.html) or
install them with your package manager (usually something like `llvm-18-dev`).
Just ensure they are in your `PATH` as `llvm-mc` and `FileCheck` (not as `llvm-mc-18` or similar).

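To confirm that both tools are reachable under exactly these names, a small check along the following lines may help. It is only a sketch using the standard library; the helper `missing_tools` is a hypothetical name and not part of Auto-Sync.

```python
import shutil

# Tool names exactly as they must appear on PATH (no version suffix).
REQUIRED_TOOLS = ("llvm-mc", "FileCheck")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


# Report the result for the current environment.
missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("llvm-mc and FileCheck are available.")
```

If a tool is only installed with a version suffix such as `llvm-mc-18`, it will be reported as missing here, which matches the requirement above.
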
## Architecture

Please read [ARCHITECTURE.md](https://github.com/capstone-engine/capstone/blob/next/docs/ARCHITECTURE.md) to understand how Auto-Sync works.

This step is essential! Please don't skip it.

## Update an architecture

Updating an architecture module to the newest LLVM release is only possible if it uses Auto-Sync.
Not all arch-modules support Auto-Sync yet.

Check if your architecture is supported:

```
./src/autosync/ASUpdater.py -h
```

Run the updater:

```
./src/autosync/ASUpdater.py -a <ARCH>
```

## Update procedure

1. Run the `ASUpdater.py` script.
2. Compare the functions in `<ARCH>DisassemblerExtension.*` to LLVM (search the function names in the LLVM root)
and update them if necessary.
3. Try to build Capstone and fix the build errors.

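Step 2 of the procedure above asks you to search the LLVM root for function names. Any search tool works for this; as one possible sketch, the hypothetical helper below walks a directory tree and lists the files and line numbers that mention a given name. The file suffixes are an assumption and may need adjusting.

```python
from pathlib import Path


def find_function_mentions(root, function_name, suffixes=(".cpp", ".h")):
    """Return (path, line number) pairs for lines that mention `function_name`.

    Only files below `root` whose suffix is in `suffixes` are read.
    This is a plain substring search, so it also matches calls and comments.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if function_name in line:
                hits.append((path, lineno))
    return hits


# Example call (the path and the function name are placeholders):
# for path, lineno in find_function_mentions("vendor/llvm_root/llvm", "<function-name>"):
#     print(f"{path}:{lineno}")
```

The reported locations are a starting point for the manual comparison; the helper does not compare function bodies.
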
## Post-processing steps

This update translates some LLVM C++ files to C.
Because the translation is not perfect (maybe it will be some day),
you will get build errors if you try to compile Capstone.

The last step to finish the update is to fix those build errors by hand.

## Additional details

### Overview of updated files

This is a rough overview of which files of an architecture are updated and where they come from.

**Files originating from LLVM** (Automatically updated)

These files are LLVM source files which were translated from C++ to C.
Not all of the files listed below are used by every architecture,
but those are the most common.

- `<ARCH>Disassembler.*`: Bytes to `MCInst` decoder.
- `<ARCH>InstPrinter.*` or `<ARCH>AsmPrinter.*`: `MCInst` to asm string decoder.
- `<ARCH>BaseInfo.*`: Commonly used functions and definitions.

`*.inc` files are exclusively generated by LLVM TableGen backends.

`*.inc` files for the LLVM component are named like this:
- `<ARCH>Gen*.inc` (note: no `CS` in the name)

Additionally, we generate more details for Capstone with `llvm-tblgen`,
such as enums, operand details and other things.

They are also saved to `*.inc` files, but have `CS` in the name to distinguish them from the LLVM generated files.

- `<ARCH>GenCS*.inc`

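The naming scheme for the generated `.inc` files can be expressed as a small classifier. This is only an illustration that assumes the two patterns above are the whole rule; `classify_inc` is a hypothetical helper and the file names in the usage lines are examples.

```python
import re

# Assumption: the patterns follow the naming scheme described above exactly.
_CAPSTONE_INC = re.compile(r"^\w+GenCS\w*\.inc$")  # <ARCH>GenCS*.inc
_LLVM_INC = re.compile(r"^\w+Gen\w*\.inc$")        # <ARCH>Gen*.inc


def classify_inc(filename: str) -> str:
    """Classify a file name as 'capstone', 'llvm' or 'unknown'."""
    # Check the more specific Capstone pattern first, because every
    # <ARCH>GenCS*.inc name also matches the broader <ARCH>Gen*.inc pattern.
    if _CAPSTONE_INC.match(filename):
        return "capstone"
    if _LLVM_INC.match(filename):
        return "llvm"
    return "unknown"


for name in ("ARMGenAsmWriter.inc", "ARMGenCSInsnEnum.inc", "ARMMapping.c"):
    print(name, "->", classify_inc(name))
```

The order of the two checks matters: testing the `CS` pattern first is what keeps the Capstone-specific files apart from the LLVM ones.
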
**Capstone module files** (Not automatically updated)

Those files are written by us:

- `<ARCH>DisassemblerExtension.*`: All kinds of functions which are needed by the LLVM component, but could not be generated or translated.
- `<ARCH>Mapping.*`: Binding code between the architecture module and the LLVM files. This is also where the detail is set.
- `<ARCH>Module.*`: Interface to the Capstone core.

### Relevant documentation and troubleshooting

**LLVM file translation**

For details about the C++ to C translation of the LLVM files refer to `CppTranslator/README.md`.

**Generated .inc files**

Documentation about the `.inc` file generation is in the [llvm-capstone](https://github.com/capstone-engine/llvm-capstone) repository.

**Troubleshooting**

- If some features aren't generated and are missing in the `.inc` files, make sure they are defined as `AssemblerPredicate` in the `.td` files.

Correct:
```
def In32BitMode : Predicate<"!Subtarget->isPPC64()">,
  AssemblerPredicate<(all_of (not Feature64Bit)), "64bit">;
```
Incorrect:
```
def In32BitMode : Predicate<"!Subtarget->isPPC64()">;
```

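To spot definitions like the incorrect one above, a rough text scan of a `.td` file can help. The sketch below is a regex heuristic, not a TableGen parser: it assumes each `def` ends at the first `;` and that inheritance is written inline as in the examples. The function name is hypothetical.

```python
import re

# Heuristic: a definition is "def NAME : <parents>;" up to the first semicolon.
_DEF = re.compile(r"\bdef\s+(\w+)\s*:(.*?);", re.DOTALL)


def predicates_without_assembler_predicate(td_text: str) -> list[str]:
    """Return names of defs that use Predicate<...> but no AssemblerPredicate<...>."""
    missing = []
    for name, parents in _DEF.findall(td_text):
        # \b keeps "Predicate<" from matching inside "AssemblerPredicate<".
        uses_predicate = re.search(r"\bPredicate<", parents) is not None
        if uses_predicate and "AssemblerPredicate<" not in parents:
            missing.append(name)
    return missing


sample = '''
def In64BitMode : Predicate<"Subtarget->isPPC64()">,
  AssemblerPredicate<(all_of Feature64Bit), "64bit">;
def In32BitMode : Predicate<"!Subtarget->isPPC64()">;
'''
print(predicates_without_assembler_predicate(sample))
```

Names reported by the scan are candidates to inspect by hand; definitions that the heuristic cannot parse are silently skipped.
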
**Formatting**

- If you make changes to the `CppTranslator`, please format the files with `black` and `usort`:
```
pip3 install black usort
python3 -m usort format src/autosync
python3 -m black src/autosync
```

## Refactor an architecture for Auto-Sync framework

Not all architecture modules support Auto-Sync yet.
Here is an overview of the steps to add support for it.

<hr>

To refactor one of them to use `auto-sync`, you need to add it to the configuration.

1. Add the architecture to the supported architectures list in `ASUpdater.py`.
2. Configure the `CppTranslator` for your architecture (`suite/auto-sync/CppTranslator/arch_config.json`).

Now, manually run the update commands within `ASUpdater.py` but *skip* the `Differ` step:

```
./src/autosync/ASUpdater.py -a <ARCH> -s IncGen Translate
```

The tasks after this are to:

- Replace leftover C++ syntax with its C equivalent.
- Implement the `add_cs_detail()` handler in `<ARCH>Mapping` for each operand type.
- Edit the main header file of the architecture (`include/capstone/<ARCH>.h`) to include the generated enums (see below).
- Add any missing logic to the translated files.
- Make it build and write tests.
- Run the Differ again and always select the old nodes.

**Notes:**

- Some generated enums must be included in the `include/capstone/<ARCH>.h` header.
At the position where the enum should be inserted, add a comment like this (don't remove the `<>` brackets):

```
// generate content <FILENAME.inc> begin
// generate content <FILENAME.inc> end
```

The update script will insert the content of the `.inc` file at this place.

- If you find yourself fixing the same syntax error multiple times,
please consider adding a `Patch` to the `CppTranslator` for this case.

- Please check out the implementation of ARM's `add_cs_detail()` before implementing your own.

- Running the `Differ` after everything is done preserves your version of syntax corrections, and the next user can auto-apply them.

- Sometimes the LLVM code uses a single function from a larger source file.
It is not worth it to translate the whole file just for this function.
Bundle those lonely functions in `<ARCH>DisassemblerExtension.c`.

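The `// generate content <FILENAME.inc> begin` and `end` markers from the first note can be pictured as delimiting a region that gets overwritten on each update. The following sketch illustrates that idea only; it is not the code the update script uses, and `insert_generated` is a hypothetical name.

```python
def insert_generated(header: str, inc_name: str, content: str) -> str:
    """Replace the text between the begin/end markers for `inc_name`.

    Raises ValueError if either marker is missing from `header`.
    """
    begin = f"// generate content <{inc_name}> begin"
    end = f"// generate content <{inc_name}> end"
    start = header.index(begin) + len(begin)  # position right after the begin marker
    stop = header.index(end, start)           # position where the end marker starts
    return header[:start] + "\n" + content.rstrip("\n") + "\n" + header[stop:]


header = (
    "typedef enum {\n"
    "// generate content <FILENAME.inc> begin\n"
    "// generate content <FILENAME.inc> end\n"
    "} example_enum;\n"
)
print(insert_generated(header, "FILENAME.inc", "\tEXAMPLE_A,\n\tEXAMPLE_B,\n"))
```

Because everything between the two markers is replaced, running the insertion again with the same content leaves the header unchanged, which is why the markers themselves must stay in place.
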
## Adding a new architecture

Adding a new architecture follows the same steps as above, with the exception that you need
to implement all the Capstone files from scratch.

Check out an architecture that supports `auto-sync` for guidance and open an issue if you need help.