reScribe Logo

a better way to search code

Web | Docs | Status | NPM | AUR

License

Note: All source files are protected by the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license, included in this directory. Users of this source code (located in this current directory and any sub-directories) may not violate the terms of said license.

Netlify Status

Note: Netlify "failed" means either deployment failed or no applicable changes found in last commit.

codecov

Upload Docs Upload Website Upload Api Upload Antlr Upload NLP Deployment Run NLP Dataprocess Language Run NLP Dataprocess Library Run NLP Training Bert Upload Prerender Upload Update Sitemap Upload Update Currencies Build Dataset Libraries Java Upload Fast Upload CLI Upload vscode Upload Github Upload Emails Frontend Origin Request Frontend Viewer Request Frontend Viewer Response Docs Origin Request Docs Viewer Response Test Code

Buy Me A Coffee

Things we need to do for Java specifically right now, for a one-language demo :)

  1. Go through search line by line; make sure it's calling NLP correctly and getting good outputs
  • Note: this requires updating the parameters for Elasticsearch & the overall query structure (a little)
  2. Go page by page and fix all the styling stuff in web (list at the bottom of this README). If there are extraneous requests, clean them up using field resolvers
  3. Test the full search stack; index a lot of files
  • Add a limit on the number of characters in a file for parsing (we don't care about files that are super big, because those will break search)
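A minimal sketch of that character limit, assuming a hypothetical `MAX_PARSE_CHARS` threshold and a `shouldParse` helper (both names are ours, not from the codebase):

```typescript
// Hypothetical guard for the indexer: skip files that are too large to parse.
// MAX_PARSE_CHARS is an assumed threshold, not a value from the codebase.
const MAX_PARSE_CHARS = 200_000;

function shouldParse(contents: string): boolean {
  // Very large files blow up parse time and bloat the search index,
  // so we drop them before they ever reach the parser.
  return contents.length <= MAX_PARSE_CHARS;
}

console.log(shouldParse("class Foo {}"));      // true
console.log(shouldParse("x".repeat(300_000))); // false
```

The exact cutoff would need tuning against real repositories; the point is only that the check runs before parsing, not after.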

  • Elasticsearch requests are not typed right now, which leads to a bunch of random errors; an Elasticsearch query builder should help fix that

  • debug with web & api

  • refactor portions of api graph model to include field resolvers as needed

  • In docs, under the development index, run the curl command to get an IP address, then use that to view the output in your browser

How to do this

  • Need to spin up all of the production / dev servers and get data into Elasticsearch (cloud development servers)
  • This means cloning random Java repositories and indexing them under the dev user (no permissions, just a login we share; we don't want everything controlled by one account)
  • Search page - keep running until it works
  • How do you run the website from Cloud9 and view the output, and how do you access the GraphQL playground from Cloud9 - Done
  • We should write a guide on how to index things using the CLI

http://54.144.74.130:8000

Things that we want to do

  • refactor the API graph model (what is an API graph model? GraphQL => makes a graph model of the data)
    • everything is flat right now, which forces us to make extra requests to the API from web & all our other stuff
    • we can keep the flat attributes with the object-id attributes, but add field resolvers to build the graph manually
    • project -> repository -> folder -> file
    • project -> field resolver for repo, folder for files, etc.
    • Elasticsearch requests are not typed right now, which leads to a bunch of random errors; an Elasticsearch query builder should help fix that
    • sync between Elasticsearch and the database easily
    • simplify the logic for getting the data, ideally done during dev of the website
    • simplify the existing codebase so that everything is less fragmented
    • nested fields in the Elasticsearch query are slow, so how do we handle that?
    • maybe utilize a standard output structure similar to GitHub Semantic's AST trees
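On the "requests are not typed" point, one lightweight option is hand-rolled TypeScript interfaces over the query DSL rather than any particular builder library. This is a sketch with names of our own invention, not an official client API:

```typescript
// A minimal typed wrapper around the Elasticsearch query DSL.
// These interfaces and function names are a sketch, not a real library's API.
interface MultiMatchQuery {
  multi_match: { query: string; fields: string[] };
}

interface SearchBody {
  query: MultiMatchQuery;
  size?: number;
}

// Building the body through a typed function means a misspelled key or a
// missing query string fails at compile time, instead of surfacing as a
// "random error" from the cluster at runtime.
function buildSearch(term: string, fields: string[], size = 10): SearchBody {
  return { query: { multi_match: { query: term, fields } }, size };
}

const body = buildSearch("binary search", ["functions.name", "classes.name"]);
console.log(JSON.stringify(body));
```

A dedicated builder package would give the same compile-time safety with less maintenance; the interfaces above just show the shape of the idea.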

Nested is slow because we perform a multi-match over each field, and they're all trigrams, so it ends up being a lot of computation. We may need to optimize Elasticsearch past what the out-of-the-box functionality offers.

Apparently you can convert from h5 to an AST (we use h5 for ANTLR4 currently).

https://github.com/tree-sitter/tree-sitter
https://github.com/github/semantic#technology-and-architecture

Elasticsearch nested fields are: comments, variables, imports, functions, classes. Each nested field has a parent, as laid out in the nested object [id, parent, and location]. In Elasticsearch they are handled differently; this is how we highlight the individual matching object instead of the whole file.
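A sketch of what such a query could look like: Elasticsearch's `nested` query with `inner_hits` returns the specific nested objects that matched, which is what allows highlighting one object instead of the whole file. The field names (`functions`, `functions.name`) are assumptions about our mapping, not confirmed from the codebase:

```typescript
// Builds a search body for one nested field (functions). `inner_hits` is a
// real Elasticsearch feature; the mapping paths here are illustrative.
function nestedFunctionQuery(term: string) {
  return {
    query: {
      nested: {
        path: "functions",
        query: { match: { "functions.name": term } },
        // inner_hits returns the specific nested objects that matched,
        // so we can highlight one function instead of the whole file.
        inner_hits: { highlight: { fields: { "functions.name": {} } } },
      },
    },
  };
}

console.log(JSON.stringify(nestedFunctionQuery("parseTree")));
```

A real search over all five nested fields would combine one such clause per field (e.g. in a `bool` `should`), which is exactly the multi-field computation cost noted above.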

Think about optimizing the Elasticsearch fields for computation time instead of disk footprint; possibly more than one search type (classes, functions, libraries, etc.).

Everything will still be stored in a flat fashion in the database; with GraphQL we will simulate making it a phat object for ease of querying.

This is what flat looks like:

    {
        repository
            this id
            array of ids for children

        file
            this id
            array of ids for each type of child
            parent id

        ...
    }

This is what phat looks like: repository -> [folder] -> [file] -> [class] -> [class functions] -> [standalone functions] -> [imports] -> package path

A field resolver is a block of code that runs when you query for a certain field - basically lazy evaluation of a field in your data object - so we want to write lazy resolvers for each layer of this query.
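That laziness can be sketched with plain functions over a flat in-memory store (the store, types, and names are invented for illustration; in the real API these would be GraphQL field resolvers hitting the database):

```typescript
// Flat storage, as in the database: each record only holds ids of children.
interface Repository { id: string; folderIds: string[] }
interface Folder { id: string; fileIds: string[]; parentId: string }

const repos: Record<string, Repository> = {
  r1: { id: "r1", folderIds: ["d1"] },
};
const folders: Record<string, Folder> = {
  d1: { id: "d1", fileIds: ["f1", "f2"], parentId: "r1" },
};

// Field resolver: only runs when a query actually asks for `folders`,
// so a query that never touches folders never pays for this lookup.
const repositoryResolvers = {
  folders: (repo: Repository): Folder[] =>
    repo.folderIds.map((id) => folders[id]),
};

// Simulated query { repository { folders { id } } }
const resolved = repositoryResolvers.folders(repos["r1"]);
console.log(resolved.map((f) => f.id)); // ["d1"]
```

One resolver per layer (repository -> folder -> file -> class -> functions) is what turns the flat ids into the phat shape on demand.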

Frontload the keyword search with keywords extracted from file documentation and definition names, and use that (along with public access) to filter out the files. Need a compressed representation of data +
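A sketch of that prefilter, assuming keywords are tokenized at index time; the extraction rule and every name here (`IndexedFile`, `prefilter`) are illustrative assumptions, not the codebase's API:

```typescript
// Cheap keyword + visibility filter that runs before the expensive
// trigram/nested Elasticsearch query ever sees the file.
interface IndexedFile {
  path: string;
  isPublic: boolean;
  keywords: Set<string>; // from docs + definition names, lowercased at index time
}

function prefilter(files: IndexedFile[], term: string): IndexedFile[] {
  const t = term.toLowerCase();
  return files.filter((f) => f.isPublic && f.keywords.has(t));
}

const files: IndexedFile[] = [
  { path: "Sort.java", isPublic: true, keywords: new Set(["sort", "quicksort"]) },
  { path: "Secret.java", isPublic: false, keywords: new Set(["sort"]) },
];
console.log(prefilter(files, "Sort").map((f) => f.path)); // ["Sort.java"]
```

The compressed representation mentioned above would replace the plain `Set` here; the filtering logic stays the same.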

Pages that need to be checked, from most important to least important:

Login, Search, Repository, Account, Profile, Projects, Repositories, About, Explore

Bugs

  • When indexing with the CLI and making a new repository, it will SOMETIMES throw an error that the repo does not exist; if you run it again, the repo will show up and work as expected. Seems like a race condition.
  • Argument Validation Error: what is it?