Comment by thundergolfer on 08/10/2021 at 07:20 UTC

2 upvotes, 1 direct reply (showing 1)

View submission: Evolving Reddit’s ML Model Deployment and Serving Architecture

Nice write up, appreciate the level of detail.

If I understand correctly, you can look at `model.schema` in the YAML file, find all the features with `source: Passed`, and build the Thrift RPC interface for a model from that?

I'm also curious what your experience has been with feature transformation code and its integration with this system. I think TensorFlow has good support for pushing feature transform logic into the model graph, such that you don't need to split feature transform logic across the backend<>model boundary.

Replies

Comment by heartfelt_ramen_fog at 08/10/2021 at 21:15 UTC

1 upvote, 0 direct replies

Thank you for taking the time to read!

> If I understand correctly, you can look at `model.schema` in the YAML file, find all the features with `source: Passed`, and build the Thrift RPC interface for a model from that?

So not quite. We don't build the Thrift RPC interface for a model dynamically from the schema. We actually use a uniform Thrift RPC interface across all deployed models, so the interface is static; it is essentially (doing a bit of handwaving here) `model.predict(features: FeatureFrame) -> ModelPredictions`.
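In Thrift IDL terms, a handwavy sketch of that uniform interface might look something like this (all type names, fields, and layouts here are illustrative guesses, not our actual definitions):

```thrift
// Illustrative sketch only -- not the real Gazette IDL.

struct FeatureFrame {
  // Column-oriented feature data, keyed by feature name.
  1: map<string, list<double>> numericColumns
  2: map<string, list<string>> stringColumns
}

struct ModelPredictions {
  // One score per row in the FeatureFrame.
  1: list<double> scores
}

service InferenceService {
  // Every deployed model is served behind this same signature.
  ModelPredictions predict(1: FeatureFrame features)
}
```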

The schema is used to fetch feature data from various sources and transform this data into a Thrift object we use to represent a data table, called a `FeatureFrame`. The `source: Passed` bit in the schema tells the inference service that this specific feature is passed in via the request rather than coming from one of our centrally managed feature DBs. Let me know if I'm explaining that clearly and if you have any other questions!
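To make that concrete, a schema might look roughly like the sketch below (the exact field names are made up, but `source: Passed` works as described):

```yaml
# Illustrative sketch of a model schema -- field names are guesses.
model:
  name: example_ranker
  schema:
    - name: user_embedding
      source: FeatureStore   # hypothetical: fetched from a feature DB
    - name: post_id
      source: Passed         # must be supplied in the request itself
    - name: request_timestamp
      source: Passed
```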

> I'm also curious what your experience has been with feature transformation code and its integration with this system. I think TensorFlow has good support for pushing feature transform logic into the model graph, such that you don't need to split feature transform logic across the backend<>model boundary.

Yes, we try to encourage pushing this into the model as much as possible, but we also want to support additional frameworks where that is less feasible. Today we do have a few features that require request-time transformations. This is something we think about a lot, since these transformations need to be coded up into the inference service, and as the blog calls out, one of our big priorities is drawing a clear boundary between "ml platform code" and "code that gets deployed onto the platform".

We are starting work on what we are calling `gazette-feature-service`, which will split the feature-serving responsibilities out of the inference service, and some sort of plugin architecture for supporting these request-time transformations is something we would love to explore. Have you had success with any specific approach to this type of transformation code?
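For what it's worth, the in-graph style of transform we try to encourage for TensorFlow models looks roughly like this sketch (feature names and training stats are made up):

```python
# Minimal sketch of pushing a feature transform into the model graph with a
# Keras preprocessing layer, so no transform logic lives in the backend.
import numpy as np
import tensorflow as tf

# Raw feature exactly as it would arrive in the request -- no backend-side
# preprocessing required.
raw_age = tf.keras.Input(shape=(1,), name="account_age_days")

# The normalization is part of the graph, so training and serving share a
# single definition of the transform.
normalizer = tf.keras.layers.Normalization(axis=None)
normalizer.adapt(np.array([1.0, 30.0, 365.0, 3650.0]))  # fit on training data

x = normalizer(raw_age)
output = tf.keras.layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model(inputs=raw_age, outputs=output)

# The exported artifact accepts raw request features directly at serving time.
tf.saved_model.save(model, "/tmp/example_savedmodel")
```

The nice property is that the exported SavedModel accepts raw request features, so nothing about the transform leaks across the backend<>model boundary.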