animeshsingh/mlx

 
 

Machine Learning eXchange (MLX)

Data and AI Assets Catalog and Execution Engine

Allows upload, registration, execution, and deployment of:

  • AI pipelines and pipeline components
  • Models
  • Datasets
  • Notebooks

Additionally it provides:

1. Deployment

To get MLX up and running quickly with only the asset catalog, follow the Quickstart Guide, which uses Docker Compose.

For a full deployment, we use the Kubeflow kfctl tooling.

2. Access the MLX UI

  1. By default the MLX UI is available at <public IP of a cluster node>:30380/os

To find the public IP of a node in your cluster, run:

kubectl get node -o wide

Look for the EXTERNAL-IP column.

  2. If you are on an OpenShift cluster, you can also use the IstioIngressGateway Route. You can find it in the OpenShift Console or with the CLI:
oc get route -n istio-system
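
The lookups above can also be scripted. The following is a minimal sketch, assuming `kubectl` (and, on OpenShift, `oc`) is already configured for the cluster. The function names, the `http://` scheme, and the default route name `istio-ingressgateway` are assumptions made for illustration; only port 30380 and the `/os` path come from this README.

```shell
# Print the first ExternalIP address reported by any node (empty if none).
first_external_ip() {
  kubectl get nodes \
    -o jsonpath='{.items[*].status.addresses[?(@.type=="ExternalIP")].address}' \
    | awk '{ print $1; exit }'
}

# Build the MLX UI address for a given host.
# The http:// scheme is an assumption; adjust it if your cluster serves HTTPS.
mlx_ui_url() {
  printf 'http://%s:30380/os\n' "$1"
}

# OpenShift only: print the host of a route in the istio-system namespace.
# The default route name is an assumption; confirm it with
# `oc get route -n istio-system` and pass the real name as the first argument.
istio_route_host() {
  oc get route "${1:-istio-ingressgateway}" -n istio-system \
    -o jsonpath='{.spec.host}'
}
```

For example, `mlx_ui_url "$(first_external_ip)"` prints the address to open in a browser. If no node reports an ExternalIP, the host part is empty and you need to supply a reachable address yourself.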

3. Import Data and AI Assets in MLX Catalog

Import data and AI assets using MLX's catalog importer

4. Usage Steps

  1. Pipelines

  2. Components

  3. Models

  4. Notebooks

  5. Datasets

5. Delete MLX

  • Delete the MLX deployment (the KfDef instance):
kubectl delete kfdef -n kubeflow --all

Note that the user profile namespaces created by the profile-controller are not deleted. The ${KUBEFLOW_NAMESPACE} namespace, which is created outside of the operator, is not deleted either.

  • Delete the Kubeflow Operator:
kubectl delete -f deploy/operator.yaml -n ${OPERATOR_NAMESPACE}
kubectl delete clusterrolebinding kubeflow-operator
kubectl delete -f deploy/service_account.yaml -n ${OPERATOR_NAMESPACE}
kubectl delete -f deploy/crds/kfdef.apps.kubeflow.org_kfdefs_crd.yaml
kubectl delete ns ${OPERATOR_NAMESPACE}
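
The operator commands above expand `${OPERATOR_NAMESPACE}` unquoted, so an unset variable leaves `-n` and `delete ns` without a value. One way to guard against that is a small wrapper, sketched below. The function name `delete_kubeflow_operator` is hypothetical, and the sketch assumes it is run from the directory that contains the `deploy/` manifests.

```shell
# Remove the Kubeflow Operator, refusing to run if OPERATOR_NAMESPACE is unset.
# The commands are the same ones listed above, with the variable quoted.
delete_kubeflow_operator() {
  if [ -z "${OPERATOR_NAMESPACE:-}" ]; then
    echo "OPERATOR_NAMESPACE is not set" >&2
    return 1
  fi
  kubectl delete -f deploy/operator.yaml -n "${OPERATOR_NAMESPACE}"
  kubectl delete clusterrolebinding kubeflow-operator
  kubectl delete -f deploy/service_account.yaml -n "${OPERATOR_NAMESPACE}"
  kubectl delete -f deploy/crds/kfdef.apps.kubeflow.org_kfdefs_crd.yaml
  kubectl delete ns "${OPERATOR_NAMESPACE}"
}
```

The wrapper does not stop on the first failed `kubectl delete`, which matches the behavior of running the commands one by one.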

6. Troubleshooting

  • When deleting the Kubeflow deployment, some mutatingwebhookconfigurations are cluster-wide resources and may not be removed, because their owner is not the KfDef instance. To remove them, run the following:

     kubectl delete mutatingwebhookconfigurations admission-webhook-mutating-webhook-configuration
     kubectl delete mutatingwebhookconfigurations inferenceservice.serving.kubeflow.org
     kubectl delete mutatingwebhookconfigurations istio-sidecar-injector
     kubectl delete mutatingwebhookconfigurations katib-mutating-webhook-config
     kubectl delete mutatingwebhookconfigurations mutating-webhook-configurations
     kubectl delete mutatingwebhookconfigurations cache-webhook-kubeflow
  • If you don't see any sample pipelines, or you receive "Failed to establish a new connection" messages, IBM Cloud NFS storage may be taking too long to provision, which causes the storage and backend microservices to time out. In this case, run the commands below to restart the pods.

     # Replace kubeflow with the KFP namespace
     NAMESPACE=kubeflow
     kubectl get pods -n ${NAMESPACE:-kubeflow}
     kubectl delete pod -n ${NAMESPACE:-kubeflow} $(kubectl get pods -n ${NAMESPACE:-kubeflow} -l app=ml-pipeline | grep ml-pipeline | awk '{print $1;exit}')
     kubectl delete pod -n ${NAMESPACE:-kubeflow} $(kubectl get pods -n ${NAMESPACE:-kubeflow} -l app=ml-pipeline-persistenceagent | grep ml-pipeline | awk '{print $1;exit}')
     kubectl delete pod -n ${NAMESPACE:-kubeflow} $(kubectl get pods -n ${NAMESPACE:-kubeflow} -l app=ml-pipeline-ui | grep ml-pipeline | awk '{print $1;exit}')
     kubectl delete pod -n ${NAMESPACE:-kubeflow} $(kubectl get pods -n ${NAMESPACE:-kubeflow} -l app=ml-pipeline-scheduledworkflow | grep ml-pipeline | awk '{print $1;exit}')

    Then redeploy the bootstrapper to properly populate the default assets. Remember to insert the IBM GitHub token if you want to retrieve any assets from IBM GitHub.

     vim bootstrapper/bootstrap.yaml # Insert the IBM Github Token
     kubectl delete -f bootstrapper/bootstrap.yaml -n $NAMESPACE
     kubectl apply -f bootstrapper/bootstrap.yaml -n $NAMESPACE
  • Additional troubleshooting on IBM Cloud is available at the wiki page.
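
The webhook cleanup and pod restart commands in this section can be written as loops. This is a minimal sketch, not the project's own tooling. The function names are hypothetical, and the pod restart assumes each pod belongs to a controller (such as a Deployment) that recreates it after deletion.

```shell
# Delete the leftover cluster-wide webhook configurations.
# --ignore-not-found avoids an error for any that are already gone.
delete_leftover_webhooks() {
  for name in \
    admission-webhook-mutating-webhook-configuration \
    inferenceservice.serving.kubeflow.org \
    istio-sidecar-injector \
    katib-mutating-webhook-config \
    mutating-webhook-configurations \
    cache-webhook-kubeflow
  do
    kubectl delete mutatingwebhookconfigurations "$name" --ignore-not-found
  done
}

# Restart the pipeline pods by deleting them by label.
# Unlike the commands above, which delete only the first matching pod,
# a label selector deletes every pod carrying that label.
# $1 is the KFP namespace (defaults to kubeflow).
restart_pipeline_pods() {
  ns="${1:-kubeflow}"
  for app in ml-pipeline ml-pipeline-persistenceagent \
             ml-pipeline-ui ml-pipeline-scheduledworkflow
  do
    kubectl delete pod -n "$ns" -l "app=$app"
  done
}
```

For example, `restart_pipeline_pods kubeflow` restarts the four pipeline services in the `kubeflow` namespace. Check the result with `kubectl get pods -n kubeflow` before redeploying the bootstrapper.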

Join the Conversation
