How to run OSCI in 2023? #174

Open
StenGruener opened this issue Jul 11, 2023 · 1 comment

StenGruener commented Jul 11, 2023

Dear team,

I am trying to run a local OSCI installation to get some stats. After failing to install the tools on Ubuntu 22.04 LTS, I switched to 20.04 LTS and at least got the dependencies installed using pip and Python 3.8.

Now the first step of the pipeline fails, i.e. running python3 osci-cli.py get-github-daily-push-events -d 2020-01-02 gives:

[2023-07-11 12:09:27,286] [ERROR] Failed to parse json: . Error: Expecting value: line 1 column 1 (char 0)
[2023-07-11 12:09:27,329] [INFO] Save push events commits for 2020-01-02 00:00:00 into file /data/landing/github/events/push/2020/01/02/2020-01-02-0.parquet
Traceback (most recent call last):
  File "osci-cli.py", line 93, in <module>
    cli(standalone_mode=False)
  File "/home/sten/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/sten/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/sten/.local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/sten/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/sten/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/sten/Desktop/OSCI/osci/actions/base.py", line 59, in execute
    return self._execute(**self._process_params(kwargs))
  File "/home/sten/Desktop/OSCI/osci/actions/load/load.py", line 34, in _execute
    return get_github_daily_push_events(day=day)
  File "/home/sten/Desktop/OSCI/osci/crawlers/github/gharchive.py", line 34, in get_github_daily_push_events
    DataLake().landing.save_push_events_commits(push_event_commits=push_events_commits, date=day)
  File "/home/sten/Desktop/OSCI/osci/datalake/local/landing.py", line 42, in save_push_events_commits
    log.info(f'Push events commits df info {get_pandas_data_frame_info(df)}')
  File "/home/sten/Desktop/OSCI/osci/utils.py", line 46, in get_pandas_data_frame_info
    df.info(buf=buf)
  File "/home/sten/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 2497, in info
    mem_usage = self.memory_usage(index=True, deep=deep).sum()
  File "/home/sten/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 2590, in memory_usage
    result = Series(self.index.memory_usage(deep=deep), index=["Index"]).append(
  File "/home/sten/.local/lib/python3.8/site-packages/pandas/core/series.py", line 305, in __init__
    data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
  File "/home/sten/.local/lib/python3.8/site-packages/pandas/core/construction.py", line 465, in sanitize_array
    subarr = construct_1d_arraylike_from_scalar(value, len(index), dtype)
  File "/home/sten/.local/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1452, in construct_1d_arraylike_from_scalar
    subarr = np.empty(length, dtype=dtype)
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type
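
The final TypeError is commonly reported when an older pandas release runs against a newer NumPy. Whether that is the cause here is an assumption, since the installed versions are not listed in this thread. A minimal sketch of a check for that combination follows; the function names and both default thresholds (`pandas_fixed_in`, `numpy_changed_in`) are hypothetical and should be verified against the pandas and NumPy release notes.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.0.3' or '1.20.0rc1' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def looks_like_known_mismatch(
    pandas_version: str,
    numpy_version: str,
    pandas_fixed_in: Tuple[int, ...] = (1, 0, 5),   # ASSUMPTION, not confirmed here
    numpy_changed_in: Tuple[int, ...] = (1, 20),    # ASSUMPTION, not confirmed here
) -> bool:
    """Return True when pandas is older and NumPy is newer than the thresholds."""
    old_pandas = parse_version(pandas_version) < pandas_fixed_in
    new_numpy = parse_version(numpy_version) >= numpy_changed_in
    return old_pandas and new_numpy
```

In the failing environment this could be called as `looks_like_known_mismatch(pandas.__version__, numpy.__version__)`. If it returns True, pinning NumPy to an older release or upgrading pandas would be the first things to try.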

Would Docker be a stable environment to run it in? My aim is to count GitHub contributions based on some email regexps.
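
For reference, the counting step itself can be sketched independently of OSCI. This is not how OSCI does it; it only assumes that commit author emails are already available as plain strings. The function name, labels, and patterns below are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable


def count_by_email_pattern(emails: Iterable[str], patterns: Dict[str, str]) -> Counter:
    """Count emails per label; the first label whose regex matches wins."""
    compiled = {label: re.compile(rx, re.IGNORECASE) for label, rx in patterns.items()}
    counts: Counter = Counter()
    for email in emails:
        for label, rx in compiled.items():
            if rx.search(email):
                counts[label] += 1
                break  # do not count one email under several labels
    return counts


# Hypothetical sample data, for illustration only.
sample_emails = ["a@example.com", "b@corp.example.org", "c@EXAMPLE.com", "d@other.net"]
sample_patterns = {
    "example": r"@example\.com$",
    "corp": r"@(.+\.)?example\.org$",
}
result = count_by_email_pattern(sample_emails, sample_patterns)
```

Matching is case-insensitive because email domains are not case-sensitive. Emails that match no pattern are simply left uncounted.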

Thanks!

cm-howard added the "question" (Further information is requested) label Jul 12, 2023
cm-howard (Collaborator) commented

Thanks for raising this @StenGruener.

I've assigned this to @vlad-isayko, who will look into it and advise soon.
