Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement in-memory search indexer #4041

Closed
akhenry opened this issue Jul 19, 2021 · 21 comments · Fixed by #4527
Closed

Reimplement in-memory search indexer #4041

akhenry opened this issue Jul 19, 2021 · 21 comments · Fixed by #4527

Comments

@akhenry
Copy link
Contributor

akhenry commented Jul 19, 2021

Open MCT object providers may define an optional search function if the associated persistence store supports search. eg. our Couch DB object provider will use Couch DB's find API to return objects matching the provided search term.

Open MCT also implements a fallback search provider for any objects whose associated object provider does not implement search. This fallback search provider is implemented as an "in-memory" search index, which tracks a subset of model attributes for all domain objects whose object provider does not implement search, so that those attributes can be subsequently searched.

This in-memory search indexer is currently implemented as an Angular-managed service, and should be re-implemented as a plugin (under /src/plugins) that is installed by default. There are actually two versions of the legacy indexer (the GenericSearchWorker, and the BareBonesSearchWorker), but only the BareBonesSearchWorker is actually in use, so the GenericSearchWorker should not be re-implemented.

The legacy Angular managed service is invoked from the ObjectAPI's search function. This logic can be retained, but it should call the new search indexer directly rather than invoking a function on the LegacyPersistenceAdapter (which is part of the adapter layer into the legacy API that will soon be removed).

Note that only the search provider needs to be re-implemented, its worker can be brought across as-is, and the rest of the legacy search bundle (which includes a SearchAggregator, some legacy UI, and associated tests) should be removed from the repository.

@scottbell
Copy link
Contributor

Should the GenericSearchWorker be deleted too?

@scottbell
Copy link
Contributor

We're going to instead add an ES6 class in src/api/objects with a search function. This will be directly called by the ObjectAPI's search. We can also can rid of the ObjectAPI's search reformatting of results after a hit.

@scottbell
Copy link
Contributor

scottbell commented Jul 29, 2021

We'll remove the generic worker registration with GenericSearchProvider. Don't re-implement the worker service.

@scottbell
Copy link
Contributor

Register objects for mutation observers when they're indexed. On mutation, reindex.

@scottbell
Copy link
Contributor

Fro GenericSearchProvider.index, go to RootRegistry to get root objects, from the IDs of the root objects, get the namespace, get the Object provider, check if search implemented, then index if not implemented. To index, see MutableDomainObject line 205

@scottbell
Copy link
Contributor

Need to get with Vista folks to test on large object corpus.

@scottbell
Copy link
Contributor

Try to consolidate GenericSearchWorker and BareBonesSearchWorker. Also try to remove some of the indirection in GenericSearchProvider too.

@scottbell
Copy link
Contributor

Make sure class doesn't start indexing until openmct.on('start) has fired. Don't start indexing in constructor.

@scottbell
Copy link
Contributor

Note I've been working on ticket in this branch:
https://github.com/nasa/openmct/compare/master...scottbell:mct4041?expand=1

@scottbell
Copy link
Contributor

An update where I am on this:

  • I've got a new shared worker going. I had to remove a lot of angular services it was depending on. I used the CouchDB worker as an exemplar.
  • It's indexing properly, and searching, but the Promise structure needs to be reworked not to use angular's $q & deferred services. I'm implementing this now.

@akhenry
Copy link
Contributor Author

akhenry commented Nov 29, 2021

Sounds like great progress, thanks for sharing @scottbell.

@scottbell
Copy link
Contributor

Searching is now working, but I had some trouble with the way the shared worker was using its data structures. Namely, in the current way search works, its worker just has an array of items. On mutation, "indexing" works by just pushing the new version of the model to the array. Consequently, if I rename a clock from "Foo Clock" to "Bar Clock", "Foo Clock" hangs around. Searching for "Foo" will return "Bar Clock". Is this what we want @akhenry ? I was instead going to use a dictionary data structure instead, and have the mutation replace the existing item, thus searching "Foo" would return nothing as no item has that name anymore.

One other question, previous versions of the indexed data structure in the workers didn't keep the whole model of the domain objects, only the name/type/id. Due to this, there's a lot of looking up of models when returning results. Do we want to continue this? It's a size/speed trade off of course, but if the size is too big, maybe we only want to keep the surface level stuff (i.e., name/type/id) as we previously did.

@scottbell
Copy link
Contributor

scottbell commented Dec 1, 2021

Talking to @akhenry , shallow memory is desired.
@akhenry will address after looking at the 👆 whether we should use a dictionary/map, or keep the array.

@akhenry
Copy link
Contributor Author

akhenry commented Dec 1, 2021

@scottbell Ah, I see. No, pushing mutated models to an array doesn't make any sense to me. Storing the models in a map keyed on id makes a lot more sense to me.

For search itself we need an array like structure because we have no choice but to do an exhaustive scan, but we could just use Object.values(indexMap)

@scottbell
Copy link
Contributor

Great! That’s roughly what I’m doing in my branch (sans the namespace, but will add that). Thank you for the feedback!

@scottbell
Copy link
Contributor

@akhenry draft PR opened here:
#4527

@scottbell
Copy link
Contributor

@akhenry @unlikelyzero Ah! I see the issue. Previously we had just used Workers for search. These can be loaded using Blobs/createObjectURL
You can't do that with SharedWorkers, which is what this PR is using, all contexts need to load from the same origin, hence the URL.

I guess this is a performance question though - should we be using Shared Workers here, or switch back to Workers? The advantage would be that several tabs could (theoretically) share indexing, which seems useful.

@scottbell
Copy link
Contributor

Discussed a bit with @akhenry and we'll keep SharedWorkers, but need to make sure it works without SharedWorkers so Safari on iOS will still function.

@unlikelyzero
Copy link
Contributor

@scottbell would you mind adding testing notes to this one?

@unlikelyzero unlikelyzero added the needs:test instructions Missing testing notes label Jan 4, 2022
@scottbell
Copy link
Contributor

scottbell commented Jan 4, 2022

To test ticket:

  • Add some objects, ensure sure can find them via their name
  • Open multiple tabs of OpenMCT, ensure searching for objects works
  • If possible, try testing with a lot of objects and ensure search performs well
  • Delete an object, ensure search results don't include deleted object
  • Test this in Safari too (as it lacks SharedWorkers)

In general, search should work as did before.

@akhenry
Copy link
Contributor Author

akhenry commented Jan 24, 2022

Verified Fixed! Works great, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants