Export Stalls #2

Open
spacehamster opened this issue Aug 18, 2020 · 102 comments

@spacehamster
Contributor

I've started to implement exporting here https://github.com/spacehamster/TypeTreeDumper/tree/working, and bugs that I would have expected to cause a hard crash instead cause the process to stall. Closing the terminal leaves a zombie Unity process alive. There are no stack traces logged in Editor.log either. I wonder if there are good ways to debug issues like that?

@DaZombieKiller
Owner

I took a quick look through the code and saw an issue that could be causing stalls: Produce was changed to take "in byte" instead of "in RuntimeTypeInfo". Passing by in can cause a copy to be made, which would truncate the data sent to the function to the size of a byte. It should use ref instead of in.

I also noticed that the managed delegate for GenerateTypeTree is wrong. The order for that one should be object, tree, flags instead of object, flags, tree. Unless you're testing with a version of Unity that lacks TypeTreeCache I don't think that's the cause of the hang though.

The zombie process could probably be dealt with by terminating the Unity process from the main console app when it closes. I'm not sure about how other issues could be debugged though.

@spacehamster
Contributor Author

Thanks for the help with Produce and GenerateTypeTree. Debugging with console logging had narrowed it down to something involving Object::Produce, but I wasn't aware I had screwed up GenerateTypeTree too.

The question was mostly about debugging. I'm not very experienced with C# interop, so I expect to make a bunch of dumb mistakes like that, and debugging is a bit harder without the stack traces Unity was giving when it crashed. I'll look into it a bit more and see if I can come up with a good solution.

@DaZombieKiller
Owner

DaZombieKiller commented Aug 18, 2020

I did some cursory searching, and apparently adding -logFile (with no file name listed afterwards) will cause Unity to route all logging to stdout, so you could potentially retrieve any stack traces in the main console program by reading from it.

Edit: According to Unity documentation, the parameter is -logfile - to log to stdout on Windows.

@spacehamster
Contributor Author

spacehamster commented Aug 18, 2020

Ah, I had assumed that it was automatically writing logs to C:\Users\username\AppData\Local\Unity\Editor\Editor.log, but it seems like you need to manually set a log path. I had been reading stale log files.

After fixing the log path and the Object::Produce signature, the log shows:

Cannot create on non-main thread without kCreateObjectFromNonMainThread

Assertion failed on expression: 'CurrentThread::IsMainThread()'

and stalls while calling DestroyImmediate. It seems it doesn't like creating and destroying objects from outside the main thread.

Also passing ObjectCreationMode.FromNonMainThread to Object::Produce changes the log to this.

Cannot create on non-main thread without an instanceID

Assertion failed on expression: 'CurrentThread::IsMainThread()'

@DaZombieKiller
Owner

DaZombieKiller commented Aug 18, 2020

Yeah, you need to create an instance ID yourself if you want to create an object outside of the main thread. I think there's a function for that somewhere in the code, I just don't know the signature. (Edit: It's ?AllocateNextLowestInstanceID@@YAHXZ, but I don't know if it's safe to call outside the main thread. It doesn't seem to check, at least. Also, I haven't checked Unity 4, but it's not present in Unity 3.)

For destroying objects outside of the main thread, we might be able to make use of the dummy project. Unity has an -executeMethod parameter that can be used to call a static method after loading the project, which could connect to the injected TypeTreeDumper.Client assembly somehow and provide an API for queuing work on the main thread.

Edit: Alternatively, we could hook the update loop with EasyHook, which would place us on the main thread. It looks like ?Update@SceneTracker@@QEAAXXZ is where the EditorApplication.update event is triggered.
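
A main-thread work queue of the kind described above can be sketched independently of Unity. This is only an illustration in C++: MainThreadQueue, Post and Drain are hypothetical names and are not part of TypeTreeDumper, EasyHook or Unity. The assumption is that a hooked update function calls Drain once per frame on the main thread, while any other thread calls Post.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical helper: any thread may Post work, but tasks only run on
// the thread that calls Drain (assumed to be the hooked update loop).
class MainThreadQueue {
public:
    void Post(std::function<void()> task) {
        assert(task != nullptr);  // an empty std::function cannot be invoked
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push(std::move(task));
    }

    // Runs every task queued so far on the calling thread.
    // Returns the number of tasks that were executed.
    int Drain() {
        std::queue<std::function<void()>> pending;
        {
            // Hold the lock only long enough to take ownership of the queue.
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(pending, tasks_);
        }
        int ran = 0;
        while (!pending.empty()) {
            pending.front()();
            pending.pop();
            ++ran;
        }
        return ran;
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> tasks_;
};
```

Because the tasks run after the lock is released, a task may itself call Post without deadlocking; that work is picked up by the next Drain.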

@spacehamster
Contributor Author

I was going to suggest trying to use -executeMethod to trigger the main thread to somehow call into TypeTreeDumper.Client to get it running on the main thread, but if EasyHook can hook into the update loop, that would be even better.

@DaZombieKiller
Owner

DaZombieKiller commented Aug 18, 2020

The AfterEverythingLoaded callback might actually be running on the main thread too, so it's worth experimenting with running more code in that.

Edit: I tried printing the result of CurrentThread::IsMainThread in that callback and got True.

@DaZombieKiller
Owner

DaZombieKiller commented Aug 18, 2020

Something interesting to note is that using SymbolResolver.Resolve/SymbolResolver.ResolveFunction within that function can cause a hang. I think the cause is DIA, because if I resolve that symbol in the Run function first, there is no hang (DiaSymbolResolver caches previously resolved symbols, so DIA wouldn't run the second time). I have no idea why that would happen though.

Edit: It seems DIA does not like being used from multiple threads. If I create a completely new DIA session on the main thread, there are no hangs. So it might be worth turning the symbol resolver and DIA-related fields into ThreadLocal<T> fields.

@spacehamster
Contributor Author

Hmm, I wonder if dbghelp.dll has the same issue. The thread issue probably isn't a big enough problem to switch over even if it doesn't.

@DaZombieKiller
Owner

I just pushed two commits that fix cross-thread usage of DiaSymbolResolver, it was actually much simpler than the solution I just proposed above.

@spacehamster
Contributor Author

spacehamster commented Sep 1, 2020

> The AfterEverythingLoaded callback might actually be running on the main thread too, so it's worth experimenting with running more code in that.
> Edit: I tried printing the result of CurrentThread::IsMainThread in that callback and got True.

I tried calling ?IsMainThread@CurrentThread@@YA_NXZ and it returns true both when called inside the AfterEverythingLoaded callback and when called from EntryPoint.Run, so there seems to be something wrong with that call.

Moving the dumping logic to the AfterEverythingLoaded callback fixes the stall.

The output is not valid, but I'm still looking into that.

@DaZombieKiller
Owner

DaZombieKiller commented Sep 1, 2020

> I tried calling ?IsMainThread@CurrentThread@@YA_NXZ and it returns true when called inside the AfterEverythingLoaded callback and when called from EntryPoint.Run, there seems to be something wrong with that call.

This is caused by an incorrect P/Invoke signature. The default behaviour for bool is to marshal a WinAPI BOOL type, instead of a C++ bool. To fix this, use [return: MarshalAs(UnmanagedType.U1)]:

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
[return: MarshalAs(UnmanagedType.U1)]
delegate bool IsMainThreadDelegate();

I think I've actually made this mistake on the TypeTreeCache delegates (and any other bool-returning delegates), since I don't think they return a WinAPI BOOL but a C++ one. They should probably also specify U1 for return marshalling.
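
To illustrate why the default marshalling misbehaves: a function returning a C++ bool only defines the low byte of the return register, whereas a WinAPI BOOL is read as a full 32-bit integer, so leftover bits in the upper bytes can turn a false into a true. The sketch below simulates the register contents with a plain integer; the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Interprets a 32-bit return register the way a WinAPI BOOL is read:
// any non-zero bit anywhere in the four bytes counts as "true".
bool ReadAsWinApiBool(std::uint32_t reg) {
    return reg != 0;
}

// Interprets the same register the way a 1-byte C++ bool is read
// (the UnmanagedType.U1 case): only the low byte is meaningful.
bool ReadAsCppBool(std::uint32_t reg) {
    return (reg & 0xFFu) != 0;
}
```

With a register value such as 0xDEADBE00 the low byte says false, but the 4-byte interpretation reports true, which would be consistent with IsMainThread appearing to return true everywhere.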

@spacehamster
Contributor Author

Regarding the export stall with Unity 5.6.7, I created a c++ dumper to help with debugging https://github.com/spacehamster/NativeTypeTreeDumper.
When using TypeTreeDumper, the editor log doesn't have a stacktrace, while the c++ dumper does.

TypeTreeDumper Editor.log
C++ Dumper Editor.log

The stacktrace looks like this

0x000000014104B685 (Unity) MemoryProfiler::RegisterRootAllocation
0x0000000140784077 (Unity) assign_allocation_root
0x000000014013E511 (Unity) BaseObjectInternal::NewObject<AssetMetaData>
0x00000001401450B2 (Unity) ProduceHelper<AssetMetaData,0>::Produce
0x0000000140927067 (Unity) Object::Produce

and some testing shows that setting MemLabelId.Identifier to 0x32 stops the stall in TypeTreeDumper and stops the crash in the c++ dumper. An identifier of 0x32 is equivalent to ?kMemBaseObject@@3UMemLabelId@@A in 5.6.7, but I have not checked more recent versions.

@DaZombieKiller
Owner

> When using TypeTreeDumper, the editor log doesn't have a stacktrace, while the c++ dumper does.

I wonder if there's a way for us to still have a native stack trace in managed code. I can't imagine it's easy though, and probably requires a bunch of managed<->unmanaged hopping around.

@spacehamster
Contributor Author

I'm looking at how to handle STL strings for versions 5.4 and lower. basic_string::c_str has a slightly different signature in each version.
5.4:

PublicSymbol: [0000EAB0][0001:0000DAB0] 
?c_str@?$basic_string@DU?$char_traits@D@std@@V?$stl_allocator@D$0EC@$0BA@@@@std@@QEBAPEBDXZ
public: char const * __cdecl std::basic_string<char,struct std::char_traits<char>,class stl_allocator<char,66,16> >::c_str(void)const

5.3:

PublicSymbol: [00D11560][0001:00D10560] 
?c_str@?$basic_string@DU?$char_traits@D@std@@V?$stl_allocator@D$0EB@$0BA@@@@std@@QEBAPEBDXZ
public: char const * __cdecl std::basic_string<char,struct std::char_traits<char>,class stl_allocator<char,65,16> >::c_str(void)const

5.2:

PublicSymbol: [0034B010][0001:0034A010] 
?c_str@?$basic_string@DU?$char_traits@D@std@@V?$stl_allocator@D$0CP@$0BA@@@@std@@QEBAPEBDXZ
public: char const * __cdecl std::basic_string<char,struct std::char_traits<char>,class stl_allocator<char,47,16> >::c_str(void)const

PublicSymbol: [0034B010][0001:0034A010]
?c_str@?$basic_string@DU?$char_traits@D@std@@V?$stl_allocator@D$0DL@$0BA@@@@std@@QEBAPEBDXZ
public: char const * __cdecl std::basic_string<char,struct std::char_traits<char>,class stl_allocator<char,59,16> >::c_str(void)const 

Etcetera, so I think the symbol resolver needs to be extended to support finding symbols that match a prefix (starting with ?c_str@?$basic_string).
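
The prefix lookup proposed here amounts to a linear scan over the mangled names. A minimal sketch, assuming the symbol names are already available as a list of strings (FindFirstWithPrefix is a hypothetical name, not an existing resolver method):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns a pointer to the first symbol name that begins with `prefix`,
// or nullptr when no name matches.
const std::string* FindFirstWithPrefix(const std::vector<std::string>& symbols,
                                       const std::string& prefix) {
    for (const std::string& name : symbols) {
        // compare(0, n, prefix) checks only the first n characters of name.
        if (name.compare(0, prefix.size(), prefix) == 0) {
            return &name;
        }
    }
    return nullptr;
}
```

Matching on the prefix sidesteps the allocator template arguments (66, 65, 47, 59 in the listings above) that differ between Unity versions.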

@DaZombieKiller
Owner

DaZombieKiller commented Sep 14, 2020

public abstract partial class SymbolResolver
{
    // Returns the names of all symbols that start with the given prefix.
    public abstract string[] FindSymbolsWithPrefix(string prefix);
}

I think an API like this would be fine. There can also be a few helper methods such as:

public partial class SymbolResolver
{
    public T* ResolveFirstWithPrefix<T>(string prefix) where T : unmanaged;
    public T  ResolveFirstFunctionWithPrefix<T>(string prefix) where T : Delegate;
}

So then we could just do:

resolver.ResolveFirstFunctionWithPrefix<CStrDelegate>("?c_str@?$basic_string@")

@DaZombieKiller
Owner

DaZombieKiller commented Sep 14, 2020

Just pushed support for this on master. Since DIA supports Regex, I changed the API design a little bit to reflect that.

c_str can be found with:

resolver.ResolveFirstFunctionMatching<CStrDelegate>(new Regex(@"\?c_str@\?\$basic_string@*"));
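
How the resolver hands this pattern to DIA is not shown in the thread. The sketch below only demonstrates how the same pattern behaves under ordinary unanchored regular-expression search, using std::regex as a stand-in engine (an assumption, not what the project uses). Under standard regex rules the trailing @* means "zero or more @ characters" rather than a wildcard, so the match succeeds because the search is unanchored.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Unanchored search for the c_str pattern quoted above.
// The ? and $ characters are escaped because they are regex metacharacters.
bool MatchesCStr(const std::string& mangledName) {
    static const std::regex pattern(R"(\?c_str@\?\$basic_string@*)");
    return std::regex_search(mangledName, pattern);
}
```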

@spacehamster
Contributor Author

This line doesn't print out the exception

Console.Error.WriteLine(ex);

but if you change it to Console.Error.WriteLine(ex.ToString());, it does. I'm not sure why that would be the case.

@DaZombieKiller
Owner

That's strange, I wonder if it's because it's being set to IpcInterface.Error which is tied to the server, and thus it can only take certain types. We might have to make a wrapper TextWriter to work around that. If what I'm thinking is correct, then Console.WriteLine(ex); should work, but Console.Out.WriteLine(ex); shouldn't.

@DaZombieKiller
Owner

Hm, I still see exceptions printed out even with Console.Error.WriteLine(ex);. Do you have an example of a situation where it fails so I can experiment with it?

@spacehamster
Contributor Author

spacehamster commented Sep 14, 2020

I just add throw new UnresolvedSymbolException("Missing Symbol Name Here"); to the top of ExecuteDumper like this

void ExecuteDumper()
{
    throw new UnresolvedSymbolException("Missing Symbol Name Here");
    var GetUnityVersion   = resolver.ResolveFunction<GetUnityVersionDelegate>("?GameEngineVersion@PlatformWrapper@UnityEngine@@SAPEBDXZ");
    var ParseUnityVersion = resolver.ResolveFunction<UnityVersionDelegate>("??0UnityVersion@@QEAA@PEBD@Z");
    ParseUnityVersion(out UnityVersion version, Marshal.PtrToStringAnsi(GetUnityVersion()));
    Dumper.Execute(new UnityEngine(version, resolver), server.OutputDirectory);
}

It also appears that if I change UnresolvedSymbolException to System.Exception, it prints like normal.

@DaZombieKiller
Owner

I can confirm that a wrapper TextWriter fixes the issue. Fixed in 81d2e56.

@spacehamster
Contributor Author

Do you know why it affects UnresolvedSymbolException but not System.Exception?

@DaZombieKiller
Owner

That's because UnresolvedSymbolException is a custom exception that doesn't implement serialization, which would be necessary for it to travel between processes.

@spacehamster
Contributor Author

Remote hooking Unity 4.7 doesn't work: a blank console pops up and nothing happens. I'm guessing it is because it is a 32-bit app? I believe 5.0 onwards are 64-bit.

@DaZombieKiller
Owner

That's probably why, yeah. It would probably work if you enable Prefer 32 Bit for a build, and you'll need to register the 32-bit msdia140.dll as well.

@DaZombieKiller
Owner

DaZombieKiller commented Sep 15, 2020

The most recent commit on master should now work when the dumper is compiled for x86 (actually, just Any CPU should work fine too) with no further changes.

@spacehamster
Contributor Author

spacehamster commented Sep 16, 2020

With Unity 4.7, ClassIDToRTTI always returns null. I've not been able to figure out why.

Also, side note, AfterEverythingLoaded was changed to a __thiscall in 4.7.

@DaZombieKiller
Owner

> With Unity 4.7, ClassIDToRTTI always returns null. I've not been able to figure out why.

It might require some investigation in Ghidra.

> Also, side note, AfterEverythingLoaded was changed to a __thiscall in 4.7.

Thanks for the heads up, I haven't verified most of the calling conventions since it only applies to x86 and not x64.

@spacehamster
Contributor Author

spacehamster commented Sep 16, 2020

The function seems really simple

/* public: static struct Object::RTTI * __cdecl Object::ClassIDToRTTI(int) */
RTTI * __cdecl ClassIDToRTTI(int param_1)
{
  _Tree<class_std::_Tmap_traits<int,struct_Object::RTTI,struct_std::less<int>,class_stl_allocator<struct_std::pair<int_const_,struct_Object::RTTI>,1,4>,0>_>
  *p_Var1;
    
  _Tree_iterator<std::_Tree_val<std::_Tmap_traits<int,Object::RTTI,std::less<int>,stl_allocator<std::pair<intconst,Object::RTTI>,1,4>,0>>>
  i;
  
  p_Var1 = gRTTI;
  find(gRTTI,(int *)&i);
  if (i == *(
             _Tree_iterator<std::_Tree_val<std::_Tmap_traits<int,Object::RTTI,std::less<int>,stl_allocator<std::pair<intconst,Object::RTTI>,1,4>,0>>>
             *)(p_Var1 + 4)) {
    return (RTTI *)0x0;
  }
  return (RTTI *)((int)i + 0x10);
}

(BTW I renamed gRTTI; it doesn't have a symbol associated with it.)

My first guess was that gRTTI was not being initialized, so I tried calling RegisterAllClasses and InitializeAllClasses, but that does not help (it warns that it can't register classes multiple times). Changing the entry point to InitializeEngineNoGraphics also didn't help.
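
Stripped of the decompiler noise, the function above appears to be an ordinary std::map lookup. The following is a hedged reconstruction, assuming gRTTI is a std::map<int, RTTI>; the RTTI struct here is a placeholder and does not reflect Unity's real layout, and GlobalRtti is a hypothetical accessor standing in for the unnamed global.

```cpp
#include <cassert>
#include <map>

// Placeholder for Object::RTTI; the real structure has different fields.
struct RTTI {
    int classID;
};

// Stand-in for the unnamed global that was labelled gRTTI in Ghidra.
std::map<int, RTTI>& GlobalRtti() {
    static std::map<int, RTTI> table;
    return table;
}

// Returns the entry for classID, or nullptr when the map has no such key.
// A map that was never populated therefore yields nullptr for every ID.
RTTI* ClassIDToRTTI(int classID) {
    std::map<int, RTTI>& table = GlobalRtti();
    std::map<int, RTTI>::iterator it = table.find(classID);
    return it == table.end() ? nullptr : &it->second;
}
```

Read this way, a null result for every class ID means either that the map being searched is empty or that the call is not reaching the map the engine populated.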

@spacehamster
Contributor Author

I wanted to use some of the code in other projects, is it possible to add a license?

@DaZombieKiller
Owner

Sure, I'll add the MIT license to the repo

@DaZombieKiller
Owner

Done a7657a1

@DaZombieKiller
Owner

Here's a JSON file that maps Unity version and build numbers to directory names under Unity_x64_mono.pdb on the symbol server: https://gist.github.com/DaZombieKiller/0a848dc59436ce23ef86e4206dfa8979

Interestingly, this reveals that it only contains PDBs for Unity 2018.1 through to Unity 2019.1. Generating these JSON files is a long process, so I'm still working on generating one for UnityPlayer_Win64_development_mono_x64.pdb.

@spacehamster
Contributor Author

spacehamster commented Oct 26, 2020

> Interestingly, this reveals that it only contains PDBs for Unity 2018.1 through to Unity 2019.1.

That matches up with what I experienced.

I'm looking at adding dumping support for older versions (3.4-4.7), but I'm having a bit of trouble. The TypeTree format is different in those versions, so there are no TypeTreeNodes. The main issue is that TypeTree children are stored in an std::list, but it seems that all of the std::list functions are inlined, so there are no suitable functions to call. I had planned to use a TypeTreeIterator to access children instead, but it appears that was only introduced in 4.7, so 3.4-4.6 would still be unsolved.

image

I suspect that even if you used C++ to access it with the std::list functions, as https://gist.github.com/robert-nix/7db0145e809b692b63f2#file-unitystructgen-cpp-L72 does, it would not work unless you used the same version of the standard library runtime that Unity was compiled against.

@DaZombieKiller
Owner

Yeah, I've been working on a refactoring of some of the engine interop code and noticed the usage of std::list in older versions. I've pretty much hit the same wall there. I guess it could be possible to make a native DLL that exposes an API for interacting with std::list, and then we just need to ensure that it compiles under the same version of the standard library.

@spacehamster
Contributor Author

Another idea I had considered was to try to reimplement the std::list standard library functions in C#, but I did not have much success when I tried that idea for std::string::c_str, so I'm not especially optimistic about that approach.

@DaZombieKiller
Owner

I considered doing that, but I think for now it's very much a last resort approach. It'd be nice to have a managed implementation at some point, but I'd like to exhaust the simpler options first.

@spacehamster
Contributor Author

There is a function void __cdecl WalkTypeTree(class TypeTree const &,unsigned char const *,int *) which seems like it might help, but I can't figure out what it is actually doing.

@DaZombieKiller
Owner

Since the API exposed by the generated Dia2Lib is messy and doesn't expose some things properly, I've been working on a replacement for it: https://github.com/DaZombieKiller/Dia2. It has XML documentation comments for the entire public API surface based on the DIA SDK Reference.

I haven't finished porting everything across though, so it still depends on Dia2Lib for now. I'm currently using this in a GUI program for viewing the contents of PDB files (and for generating C++ and C# structures from them) which I'll also put up on GitHub once it's ready.

Ideally we should be able to turn the Symbol class into a bunch of other classes (CompilandSymbol, PublicSymbol, etc) to make it clearer which fields are actually usable on a particular symbol.

I also noticed that Ghidra has support for importing .pdb.xml files which it generates from .pdb files on Windows. I'm going to see if I can allow my tool to output in this format too with extra customization, so we can merge type information.

@spacehamster
Contributor Author

spacehamster commented Nov 2, 2020

I've been playing around with the older versions of unity, and have had some success accessing TypeTreeLists using Visual Studio 2019 (v142) like this:

#define _ITERATOR_DEBUG_LEVEL 0
#include <string>
#include <list>
#define EXPORT  extern "C" __declspec(dllexport)

struct TypeTree;

template <class T = char>
class stl_allocator : public std::allocator<T>
{
public:
	// Rebind must name this allocator so std::list can allocate its nodes with it.
	template<class _Other>
	struct rebind
	{
		typedef stl_allocator<_Other> other;
	};

	stl_allocator()
	{
	}
	stl_allocator(const std::allocator<T>&)
	{
	}
};

typedef std::list<TypeTree, stl_allocator<TypeTree>> TypeTreeList;
typedef std::basic_string<char, std::char_traits<char>, stl_allocator<char>> TypeTreeString;

struct TypeTree
{
	TypeTreeList m_Children;
	int _dummy1;
	int _dummy2;
	TypeTree* m_Father;
	TypeTreeString m_Type;
	int _dummy3;
	TypeTreeString m_Name;
	int _dummy4;
	int m_ByteSize;
	int m_Index;
	int m_IsArray;
	int m_Version;
	int m_MetaFlag;
	int m_ByteOffset;
	void* m_DirectPtr;
};


EXPORT const char* StringToCStr(std::string* str) {
	return str->c_str();
}

EXPORT unsigned int ListSize(TypeTreeList* list) {
	return list->size();
}

// Returns a pointer to the element at `index`, or nullptr if the index is out of range.
EXPORT TypeTree* ListGet(TypeTreeList* list, int index) {
	int i = 0;
	for (auto it = list->begin(); it != list->end(); ++it, ++i)
	{
		if (i == index) {
			return &*it;
		}
	}
	return nullptr;
}

csharp

[DllImport("StlHelper", CallingConvention = CallingConvention.Cdecl)]
internal static extern IntPtr StringToCStr(void* str);

[DllImport("StlHelper", CallingConvention = CallingConvention.Cdecl)]
internal static extern int ListSize(void* list);

[DllImport("StlHelper", CallingConvention = CallingConvention.Cdecl)]
internal static extern TypeTree* ListGet(void* list, int index);

It doesn't work when I add int rootref; to the stl_allocator: it causes the struct alignment to become invalid, and I'm not quite sure why.

TypeTreeList seems to be implemented as a doubly linked list, and I think it would be cleaner to access it purely from C#, like this:

unsafe struct TypeTreeList
{
	public TypeTreeListNode* Head;
	public uint Size;
	public int Padding1;
	public int Padding2;
}
unsafe struct TypeTreeListNode
{
	public TypeTreeListNode* Next;
	public TypeTreeListNode* Prev;
	public TypeTree Value;
}

@spacehamster
Contributor Author

spacehamster commented Nov 2, 2020

Did some more testing, and was able to successfully walk the tree without the c++ helpers like this

public void CreateNodes(ManagedTypeTree owner, ref List<TypeTreeNode> nodes, ref TypeTree tree, int level = 0)
{
	var typeIndex = GetOrCreateStringIndex(tree.m_Type);
	var nameIndex = GetOrCreateStringIndex(tree.m_Name);
	var nodeImpl = new TypeTreeNode.V1(
		version: (short)tree.m_Version,
		level: (byte)level,
		typeFlags: (TypeFlags)tree.m_IsArray,
		typeStrOffset: typeIndex,
		nameStrOffset: nameIndex,
		byteSize: tree.m_ByteSize,
		index: tree.m_Index,
		metaFlag: tree.m_MetaFlag);
	nodes.Add(new TypeTreeNode(nodeImpl, owner));

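	// Head appears to be a sentinel node that holds no value;
	// the first real element is Head->Next.
	TypeTreeListNode* node = tree.m_Children.Head;
	for(int i = 0; i < tree.m_Children.Size; i++)
	{
		node = node->Next;
		// Copy the child out of the native node before recursing.
		TypeTree child = node->Value;
		CreateNodes(owner, ref nodes, ref child, level + 1);
	}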
}

I think accessing it from c# is the cleanest solution
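
The traversal used by the C# loop above can be shown on a self-contained model of the list. This sketch assumes the layout inferred in the thread (a header holding a sentinel pointer and an element count, and nodes that store their links before the value); the struct names are illustrative and an int stands in for the embedded TypeTree.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified node layout: links first, then the stored value.
struct ListNode {
    ListNode* next;
    ListNode* prev;
    int value;  // stands in for the embedded TypeTree
};

// Simplified list header: pointer to a sentinel node plus element count.
struct ListHeader {
    ListNode* head;    // sentinel; holds no value
    std::size_t size;  // number of real elements
};

// Starts at the sentinel and follows `next` exactly `size` times,
// collecting each element's value in order.
std::vector<int> CollectValues(const ListHeader& list) {
    std::vector<int> out;
    const ListNode* node = list.head;
    for (std::size_t i = 0; i < list.size; ++i) {
        node = node->next;
        out.push_back(node->value);
    }
    return out;
}
```

Bounding the loop by the stored size rather than by comparing against the sentinel mirrors the C# version; either approach should visit the same nodes if the layout assumption holds.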

@DaZombieKiller
Owner

Nice! So is this for Unity 3.4? Or is there an even earlier version available on Windows?

> I think accessing it from c# is the cleanest solution

I agree, it also keeps the build system simple since we don't need to include a C++ project.

@spacehamster
Contributor Author

spacehamster commented Nov 2, 2020

I've only tested this in 4.5, but I'm hoping the TypeTreeList layout will be the same in 3.4 and that the same method will work there too. I haven't been able to find any versions of Unity older than 3.4.

@DaZombieKiller
Owner

Here's part of the code for the 3.4 TypeTree from the refactor I've been working on:

using System;

namespace Unity.V3_4
{
    public unsafe partial struct TypeTree
    {
        public CppList Children;
        public TypeTree* Father;
        public CppString Type;
        public CppString Name;
        public int ByteSize;
        public int Index;
        public int IsArray;
        public int Version;
        public TransferMetaFlags MetaFlag;
        public int ByteOffset;
        public IntPtr DirectPtr;
    }
}

Does that look similar?

@spacehamster
Contributor Author

spacehamster commented Nov 2, 2020

yes

internal unsafe struct TypeTreeString
{
	fixed byte Data[28];
};

internal unsafe struct TypeTreeList
{
	public TypeTreeListNode* Head;
	public uint Size;
	public int Padding1;
	public int Padding2;
};

internal unsafe struct TypeTreeListNode
{
	public TypeTreeListNode* Next;
	public TypeTreeListNode* Prev;
	public TypeTree Value;
}

internal unsafe struct TypeTree
{
	public TypeTreeList m_Children;
	public TypeTree* m_Father;
	public TypeTreeString m_Type;
	public TypeTreeString m_Name;
	public int m_ByteSize;
	public int m_Index;
	public int m_IsArray;
	public int m_Version;
	public TransferMetaFlags m_MetaFlag;
	public int m_ByteOffset;
	public void* m_DirectPtr;
}

@spacehamster
Contributor Author

spacehamster commented Nov 2, 2020

Also, TypeTreeString's implementation seems to be identical to this: https://stackoverflow.com/questions/40393350/why-does-microsofts-implementation-of-stdstring-require-40-bytes-on-the-stack (except for the 4 bytes of padding at the end that appear unused),
so if there is ever a version that is missing c_str, I think it should be possible to access it from C#.
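
Reading such a string without calling c_str comes down to choosing between the inline buffer and the heap pointer. The sketch below assumes the small-string layout described in the linked answer (a 16-byte union followed by size and capacity); it has not been verified against any Unity build, and the struct and function names are illustrative. On a 32-bit target this layout is 16 + 4 + 4 = 24 bytes, which together with the 4 unused bytes mentioned above would account for the 28-byte TypeTreeString.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed layout of the MSVC std::string payload (small-string optimisation).
struct MsvcStringLayout {
    union {
        char buffer[16];      // holds the text while capacity < 16
        const char* pointer;  // points at the heap copy otherwise
    } storage;
    std::size_t size;      // current length, excluding the terminator
    std::size_t capacity;  // 15 for an inline string, larger when on the heap
};

// Equivalent of c_str(): pick the inline buffer or the heap pointer.
const char* CStr(const MsvcStringLayout& s) {
    return s.capacity < 16 ? s.storage.buffer : s.storage.pointer;
}
```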

@spacehamster
Contributor Author

How are you running TypeTreeDumper on the 3.4 player? Are you building a custom .exe or can you run Data\PlaybackEngines\windowsdevelopmentstandaloneplayer\player_win.exe directly?

@DaZombieKiller
Owner

> How are you running TypeTreeDumper on the 3.4 player? Are you building a custom .exe or can you run Data\PlaybackEngines\windowsdevelopmentstandaloneplayer\player_win.exe directly?

I just run it on player_win.exe and hook PlayerInitEngineNoGraphics instead of AfterEverythingLoaded.

@spacehamster
Contributor Author

spacehamster commented Nov 2, 2020

I get this message trying to run player_win.exe directly
image
I assume I'm meant to build a game then?

@DaZombieKiller
Owner

Ah, yeah. You need to build a small dummy game so that the player will boot.

@spacehamster
Contributor Author

spacehamster commented Nov 9, 2020

So I got a bit suspicious when the disassembly in Visual Studio didn't match up with the disassembly in Ghidra for Unity 3.5. If you dump the raw bytes for the function AfterEverythingLoaded before RemoteHooking.WakeUpProcess is called, you get the same results as Ghidra.
image
Raw Memory

AfterEverythingLoaded: 00B8EF90
0000: 427D7493AD2BF468
0008: 66F64A989697643D
0010: FD51BE92DC7BD246
0018: 9F49EE2112D0B664
0020: 2E9C4445B69B03FB
0028: EE03BD39E50191F9
0030: F464B72DCF81A5BD
0038: A8BB68AF8A960BB8
0040: C10DC66900C5867D
0048: E2E953E458CF3BD4
0050: 5D013CF256DB43C6
0058: 4BEC6066AC4AABC3
0060: CB906C5650FB92F6
0068: 3D2CF452767CB545
0070: 9E0EC533BB7C9DD2
0078: E6FE955DFDFC8A09

Putting those bytes through an online decompiler says it's junk.
But if you wait until after -executeMethod Loader.Load is called with the legacy loading method, the raw memory looks like this:

AfterEverythingLoaded: 00B8EF90
0000: 558BEC6AFF68F8ED
0008: 160164A100000000
0010: 5083EC5CA1080B6B
0018: 0133C58945F05356
0020: 57508D45F464A300
0028: 000000E890E7CCFF
0030: 8BC8E819E8CCFFE8
0038: 042760008B702433
0040: DBBF0F000000C745
0048: D42B000000897DEC
0050: 895DE8885DD8895D
0058: FCE8727DE5FF8B48
0060: 108B50140FACD102
0068: C1EA0280E1018D4D
0070: D474096A0468781E
0078: 3E01EB076A076870

and running that through a disassembler produces actual asm code.

0:  55                      push   ebp
1:  8b ec                   mov    ebp,esp
3:  6a ff                   push   0xffffffff
5:  68 f8 ed 16 01          push   0x116edf8
a:  64 a1 00 00 00 00       mov    eax,fs:0x0
10: 50                      push   eax
11: 83 ec 5c                sub    esp,0x5c
14: a1 08 0b 6b 01          mov    eax,ds:0x16b0b08
19: 33 c5                   xor    eax,ebp
1b: 89 45 f0                mov    DWORD PTR [ebp-0x10],eax
1e: 53                      push   ebx
1f: 56                      push   esi
20: 57                      push   edi
21: 50                      push   eax
22: 8d 45 f4                lea    eax,[ebp-0xc]
25: 64 a3 00 00 00 00       mov    fs:0x0,eax
2b: e8 90 e7 cc ff          call   0xffcce7c0
30: 8b c8                   mov    ecx,eax
32: e8 19 e8 cc ff          call   0xffcce850
37: e8 04 27 60 00          call   0x602740
3c: 8b 70 24                mov    esi,DWORD PTR [eax+0x24]
3f: 33 db                   xor    ebx,ebx
41: bf 0f 00 00 00          mov    edi,0xf
46: c7 45 d4 2b 00 00 00    mov    DWORD PTR [ebp-0x2c],0x2b
4d: 89 7d ec                mov    DWORD PTR [ebp-0x14],edi
50: 89 5d e8                mov    DWORD PTR [ebp-0x18],ebx
53: 88 5d d8                mov    BYTE PTR [ebp-0x28],bl
56: 89 5d fc                mov    DWORD PTR [ebp-0x4],ebx
59: e8 72 7d e5 ff          call   0xffe57dd0
5e: 8b 48 10                mov    ecx,DWORD PTR [eax+0x10]
61: 8b 50 14                mov    edx,DWORD PTR [eax+0x14]
64: 0f ac d1 02             shrd   ecx,edx,0x2
68: c1 ea 02                shr    edx,0x2
6b: 80 e1 01                and    cl,0x1
6e: 8d 4d d4                lea    ecx,[ebp-0x2c]
71: 74 09                   je     0x7c
73: 6a 04                   push   0x4
75: 68 78 1e 3e 01          push   0x13e1e78
7a: eb 07                   jmp    0x83
7c: 6a 07                   push   0x7
7e: 68                      .byte 0x68
7f: 70                      .byte 0x70

I suspect the exe is encrypted, compressed or somehow obfuscated and the hooks can't be set until after the valid executable has been loaded into memory. It makes reverse engineering difficult because the exe on disk is not valid.

@spacehamster
Contributor Author

spacehamster commented Nov 9, 2020

I got a proof of concept for hooking to work in 3.5 like this

[SuppressMessage("Style", "IDE0060", Justification = "Required by EasyHook")]
public void Run(RemoteHooking.IContext context, EntryPointArgs args){
	...snip...
	OnEngineInitialized += ExecuteDumper;
	RemoteHooking.WakeUpProcess();
	if (VersionInfo.FileMajorPart == 3)
	{
		HookAfterEverythingLoadLegacy();
	}
	Thread.Sleep(Timeout.Infinite);
	...snip...
}
unsafe void HookAfterEverythingLoadLegacy()
{
	Console.WriteLine("Trying to hook AfterEverythingLoaded");
	resolver.TryResolveFirstMatching(new Regex(Regex.Escape("?AfterEverythingLoaded@Application@") + "*"), out var address);
	byte* buffer = (byte*)address;
	byte[] matchingCodes = new byte[] {
		0x55, 0x8B, 0xEC, 0x6A, 0xFF, 0x68, 0xF8, 0xED
	};
	//Wait for target to be decrypted
	bool validCodes = false;
	while (!validCodes)
	{
		bool foundMatch = true;
		for(int i = 0; i < matchingCodes.Length; i++)
		{
			if (buffer[i] != matchingCodes[i]) foundMatch = false;
		}
		validCodes = foundMatch;
	}
	Console.WriteLine("Hooking AfterEverythingLoaded");
	AfterEverythingLoadedHook = LocalHook.Create(address, new AfterEverythingLoadedDelegate(AfterEverythingLoaded), null);
	AfterEverythingLoadedHook.ThreadACL.SetExclusiveACL(Array.Empty<int>());
	Console.WriteLine("Hooked AfterEverythingLoaded");
}

This method requires you to know what the valid function bytes are in advance, but I suspect you might be able to just watch for changed bytes instead of valid bytes. For reverse engineering, you can dump the process from memory and feed that into IDA. I tried with Ghidra, but it refuses to load the associated .pdb if it's not an exact match to the .exe.
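
The two detection strategies mentioned here (matching known-good bytes versus noticing that the bytes differ from an earlier snapshot) can be compared side by side. This is a sketch with hypothetical helper names; a real polling loop would also need to re-read the target memory on each iteration, and a change in the bytes does not by itself prove that decryption has finished.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// True when the bytes at `code` equal the expected function prologue.
// Requires advance knowledge of the valid bytes.
bool PrologueMatches(const std::uint8_t* code,
                     const std::vector<std::uint8_t>& expected) {
    return std::equal(expected.begin(), expected.end(), code);
}

// True when the bytes at `code` differ from a snapshot taken earlier,
// for example before the process was woken up. Needs no prior knowledge.
bool HasChanged(const std::uint8_t* code,
                const std::vector<std::uint8_t>& snapshot) {
    return !std::equal(snapshot.begin(), snapshot.end(), code);
}
```

Using the first eight bytes of each memory dump from this thread, the encrypted bytes fail the prologue check and the decrypted bytes register as changed relative to the encrypted snapshot.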

@spacehamster
Contributor Author

spacehamster commented Nov 9, 2020

I'm wondering which approach you prefer, dumping the 3.5 editor or dumping the 3.5 player?

@DaZombieKiller
Owner

Dumping from the editor is more consistent, so in general I think it's nice to support it. Ideally we can support both, so I think it's a good idea to implement your solution.

@spacehamster
Contributor Author

spacehamster commented Nov 9, 2020

For initializing the AfterEverythingLoaded callback, do you prefer the command-line -executeMethod method or the wait-for-decrypt-then-EasyHook method?
The command-line method seems safer, as the wait-for-decrypt method introduces a race condition where the target function could be called after it is decrypted but before the hook is set.

I believe that by the time AfterEverythingLoaded is called, it should be safe to hook functions like normal.

@DaZombieKiller
Owner

Yeah, the executeMethod approach seems better for now and far less error prone. We'll just need to move all of the hooking stuff out into its own function so AfterEverythingLoaded can call it on 3.5.

@DaZombieKiller
Owner

I did a bit of thinking and figured out that we can pass data from the dumper to the editor by using process environment variables. I've replaced the template string approach with this in ddec84f.
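
Passing data through the environment works because a child process normally inherits the environment of the process that launched it, so a value set by the launcher before it starts the child can be read back inside the child. The reading side might look like the sketch below; GetEnvOr is a hypothetical helper and the variable names the project actually uses are not given in the thread.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Reads an environment variable, returning `fallback` when it is not set.
std::string GetEnvOr(const char* name, const std::string& fallback) {
    const char* value = std::getenv(name);
    return value != nullptr ? std::string(value) : fallback;
}
```

A fallback keeps the reading code usable when the process is launched by hand without the launcher having set anything.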

@spacehamster
Contributor Author

> Interesting, so 3.4 passes around allocator pointers instead then? Hopefully there's an easy way to access the right one. If there's no global function for retrieving it, we could hook a function that is always called in order to grab whatever allocator is passed to it.

If the base allocator passed into Object::Produce is null, it uses a default allocator. I'm gonna try that and see if it works
image

@DaZombieKiller
Owner

Just wanted to mention that this experimental build of Unity 2020.2 contains a completely unstripped PDB: https://forum.unity.com/threads/experimental-contacts-modification-api.924809/#post-6427604
