Better heuristics for symbols #23

Open
larsbrinkhoff opened this issue Sep 14, 2018 · 8 comments

@larsbrinkhoff
Owner

Improve how the symbol table is used for disassembly.

@larsbrinkhoff
Owner Author

Examples:

  • jfcl u, 112541 - The JFCL AC field is usually a number.
  • consz cfhpo9, @usrpdl(q) - Prefer device codes for IO instructions.
  • addi a, l - Maybe in some cases prefer numbers for immediate operands?
  • xctri b, 105657 - Prefer X symbols for XCTR and XCTRI.
  • skipn tabp, shutdn - When the AC is optional and is 0, leave the field blank rather than use a symbol which is 0.
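
For reference, the fields these examples talk about can be pulled out of a 36-bit instruction word roughly as in the sketch below. The names `insn_fields` and `split_word` are made up for illustration and are not taken from the disassembler's source. The layout is the ordinary PDP-10 one for non-IO instructions; IO instructions such as CONSZ carry a device code instead and would need their own decoding.

```c
#include <assert.h>
#include <stdint.h>

/* One 36-bit PDP-10 word, held in the low bits of a 64-bit integer. */
typedef uint64_t word_t;

/* Hypothetical holder for the decoded fields (not the project's type). */
struct insn_fields {
  uint32_t opcode;   /* bits 0-8:   9-bit opcode        */
  uint32_t ac;       /* bits 9-12:  4-bit accumulator   */
  uint32_t indirect; /* bit 13:     indirect bit        */
  uint32_t index;    /* bits 14-17: 4-bit index register */
  uint32_t address;  /* bits 18-35: 18-bit address      */
};

/* Split a non-IO instruction word into its fields. */
static struct insn_fields split_word(word_t w)
{
  struct insn_fields f;

  assert((w >> 36) == 0); /* only 36 bits may be set */
  f.opcode   = (uint32_t)((w >> 27) & 0777);
  f.ac       = (uint32_t)((w >> 23) & 017);
  f.indirect = (uint32_t)((w >> 22) & 01);
  f.index    = (uint32_t)((w >> 18) & 017);
  f.address  = (uint32_t)(w & 0777777);
  return f;
}
```

For example, the word 201043000002 (octal) splits into opcode 201 (MOVEI), AC 1, index 3 and address 2. Each heuristic in the list then applies to exactly one of these fields.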

@larsbrinkhoff
Owner Author

In the AC field, the symbols A-E, P, T, TT could receive some special treatment. Maybe more. This is just a heuristic, but the vast majority of ITS programs use these AC definitions.
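
A minimal sketch of that special treatment, assuming symbol names are compared in upper case and that the caller has already collected every name sharing the AC value. `is_conventional_ac` and `prefer_ac_name` are hypothetical helpers, not existing functions in the disassembler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The AC names singled out above: A-E, P, T, TT. */
static const char *const conventional_acs[] = {
  "A", "B", "C", "D", "E", "P", "T", "TT"
};

/* Return 1 if NAME is one of the conventional ITS accumulator names. */
static int is_conventional_ac(const char *name)
{
  size_t i;

  assert(name != NULL);
  for (i = 0; i < sizeof conventional_acs / sizeof conventional_acs[0]; i++)
    if (strcmp(name, conventional_acs[i]) == 0)
      return 1;
  return 0;
}

/* Among NAMES that all share one value, return the first conventional
   AC name.  Fall back to the first candidate, or NULL if there are none. */
static const char *prefer_ac_name(const char *const *names, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (is_conventional_ac(names[i]))
      return names[i];
  return n > 0 ? names[0] : NULL;
}
```

The fallback to the first candidate is an arbitrary choice for the sketch; it keeps the current behaviour for programs that do not follow the usual AC definitions.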

@larsbrinkhoff
Owner Author

More examples:

  • .logout unfn1, - If AC is 0, disassemble as .logout.
  • .value unfn1 - If argument is 0, don't print it.

@larsbrinkhoff
Owner Author

CC @atsampson, maybe you have some ideas around this?

@atsampson
Collaborator

atsampson commented Sep 2, 2020

I was thinking about this again recently...

Symbols with a value of 0-17 that are neither killed nor halfkilled are probably AC names. When symbolising instruction fields, have some hints about which fields are/aren't likely to be ACs - e.g. SETZM 1(A) is more likely than SETZM A(1) or SETZM A(A), and MOVEI A,1 is more likely than MOVEI A,A.

And as you suggested above, if I have A==1 and FOOBIT==1, then the symboliser should prefer A. Maybe just preferring shorter symbol names would work for this.
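
Roughly, the two ideas above could look like the sketch below. The `struct symbol` layout, the flag fields and both function names are assumptions made for the example; the real symbol table may well store the killed and halfkilled state differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical symbol record, not the disassembler's own structure. */
struct symbol {
  const char *name;
  uint64_t value;
  int killed;     /* nonzero if the symbol is killed */
  int halfkilled; /* nonzero if the symbol is halfkilled */
};

/* A symbol with value 0-17 (octal) that is neither killed nor
   halfkilled is probably an AC name. */
static int probably_ac(const struct symbol *s)
{
  return s->value <= 017 && !s->killed && !s->halfkilled;
}

/* Return the shortest-named symbol equal to VALUE, or NULL if none.
   When AC_FIELD is nonzero, only probable AC names are considered.
   Ties keep the earliest table entry. */
static const struct symbol *shortest_match(const struct symbol *tab, size_t n,
                                           uint64_t value, int ac_field)
{
  const struct symbol *best = NULL;
  size_t i;

  assert(tab != NULL || n == 0);
  for (i = 0; i < n; i++) {
    const struct symbol *s = &tab[i];
    if (s->value != value)
      continue;
    if (ac_field && !probably_ac(s))
      continue;
    if (best == NULL || strlen(s->name) < strlen(best->name))
      best = s;
  }
  return best;
}
```

With A==1 and FOOBIT==1 in the table, a lookup of value 1 returns A, since both pass the AC test and A is shorter.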

For STINK programs with multiple sections of symbols, it'd be nice to have a way of ignoring some sections (e.g. the AGC symbols in Muddle, which are only valid when the GC is mapped in).

@larsbrinkhoff
Owner Author

Useful hints:

  • Opcode field. Possibly match LUUOs?
  • Device code from IOT instructions.
  • Accumulator or index field. I don't think it's useful to have those separate.
  • Accumulator field as number. E.g. JRST and .LOGOUT.
  • Accumulator field from XCTR.
  • Address field as memory reference.
  • Address field as immediate.
  • Accumulator = 0 is not printed: SKIP etc.
  • Address = 0 is not printed: .VALUE etc.

Some of these combine in interesting ways.
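
One way to let the hints combine is to make them bit flags attached to each opcode, as sketched below. All `HINT_*` names, the table contents and the default for unlisted opcodes are assumptions for illustration; they are not the flags the disassembler defines. Only a few opcodes mentioned in this thread are listed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-field hints; several may apply to one opcode. */
enum field_hint {
  HINT_AC_REGISTER  = 1 << 0, /* AC field names an accumulator */
  HINT_AC_NUMBER    = 1 << 1, /* AC field is a plain number */
  HINT_AC_OMIT_ZERO = 1 << 2, /* AC = 0 is not printed */
  HINT_E_MEMORY     = 1 << 3, /* address is a memory reference */
  HINT_E_IMMEDIATE  = 1 << 4, /* address is an immediate operand */
  HINT_E_OMIT_ZERO  = 1 << 5  /* address = 0 is not printed */
};

struct opcode_hints {
  uint16_t opcode; /* 9-bit opcode */
  unsigned hints;  /* OR of enum field_hint values */
};

static const struct opcode_hints hint_table[] = {
  { 0201, HINT_AC_REGISTER | HINT_E_IMMEDIATE },                  /* MOVEI */
  { 0254, HINT_AC_NUMBER | HINT_E_MEMORY },                       /* JRST */
  { 0255, HINT_AC_NUMBER | HINT_E_MEMORY },                       /* JFCL */
  { 0271, HINT_AC_REGISTER | HINT_E_IMMEDIATE },                  /* ADDI */
  { 0336, HINT_AC_REGISTER | HINT_AC_OMIT_ZERO | HINT_E_MEMORY }  /* SKIPN */
};

/* Look up the hints for OPCODE.  Unlisted opcodes get an assumed
   default of "AC is a register, address is a memory reference". */
static unsigned hints_for(unsigned opcode)
{
  size_t i;

  assert(opcode <= 0777);
  for (i = 0; i < sizeof hint_table / sizeof hint_table[0]; i++)
    if (hint_table[i].opcode == opcode)
      return hint_table[i].hints;
  return HINT_AC_REGISTER | HINT_E_MEMORY;
}

/* Decide whether the AC field should be printed at all. */
static int should_print_ac(unsigned hints, unsigned ac)
{
  return !((hints & HINT_AC_OMIT_ZERO) && ac == 0);
}
```

SKIPN shows the combination: its AC is symbolised as a register when nonzero, and left blank when zero.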

@larsbrinkhoff
Owner Author

Maybe a hint that the address is often a literal, to be disassembled in line?

@larsbrinkhoff
Owner Author

Accumulator or address 0 is already implemented as PDP10_A_UNUSED and PDP10_E_UNUSED.
