Q: ML.NET within SQLCLR #2571
Comments
@grahamehorner From an architecture perspective, this doesn't sound like a good idea. You are talking about using expensive SQL Server CPU resources for training, which often consumes large amounts of CPU. Some ML algorithms use GPUs instead of CPUs when they are available, and in the future we will likely have algorithms that depend on quantum co-processors (at least during training). You may also want to build pipelines that do things that are not allowed in the SQLCLR sandbox (like file I/O?).
Thanks for the reply. I have a requirement where models need to be trained near the data, and the data is encrypted and held within SQL Server.
From: Chris Hewitt
Sent: Sunday, February 17, 2019 11:50:10 PM
If you can build a training pipeline in .NET Standard (is 1.6 supported in SQLCLR?) then it could work, but why not just do the training outside? It will be much easier. You could do the predictions in SQLCLR in some cases (depending on whether the pipeline code is SQLCLR compatible).
SQLCLR is a sandbox environment with limitations on how you can interact with resources outside it. Training could work but would require the pipeline(s) and algorithm(s) to only do things allowed by the sandbox.
@grahamehorner Yeah, that makes sense. I have had similar requirements myself (pre ML.NET). Have you tried to use ML.NET in SQLCLR yet? If so, what problems did you encounter? (It would save me some pain trying it.) Have you tried using https://docs.microsoft.com/en-us/sql/advanced-analytics/what-is-sql-server-machine-learning?view=sql-server-2017, or the 2019 version of the same? That supports only R or Python so far, though, and in any case, because it is an external sandbox, the data will be unencrypted when passed to it, although it stays on the same machine/VM. Some of the diagrams for that tout 'keep your data encrypted', but I don't think that's actually how it works. The subtleties don't always make it to the marketing department of the VLCC :-). It is pretty safe though; I'm not sure how you could hack that without taking over the server's OS first. The junior equivalent would be to just run your training on the SQL Server machine in a separate process. No extra tech is needed and there is no network traffic (using shared memory), so it is as safe as your database server, and arguably as safe as something running in the SQL Server extensibility framework.
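A minimal sketch of the "separate process on the same machine" idea, under loud assumptions: it is written in Python rather than ML.NET, it uses an in-memory `sqlite3` database as a stand-in for a local SQL Server connection, and it fits a one-variable least-squares line instead of a real training pipeline. The table `readings`, its columns, and all function names are hypothetical.

```python
import sqlite3


def load_rows(conn):
    """Read (x, y) training pairs from the hypothetical `readings` table."""
    return conn.execute("SELECT x, y FROM readings").fetchall()


def fit_line(rows):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def train_next_to_data():
    # ASSUMPTION: sqlite3 in memory stands in for a local, same-machine
    # connection to the database server; no data crosses the network.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE readings (x REAL, y REAL)")
        conn.executemany(
            "INSERT INTO readings VALUES (?, ?)",
            [(0, 1), (1, 3), (2, 5), (3, 7)],  # made-up sample data
        )
        # Training runs in this process, outside the database engine.
        return fit_line(load_rows(conn))
    finally:
        conn.close()


if __name__ == "__main__":
    slope, intercept = train_next_to_data()
    print(f"slope={slope}, intercept={intercept}")
```

The point of the shape is that the training code is an ordinary program with no sandbox restrictions, while the data never leaves the database host. Whether a local connection actually uses shared memory depends on the server and client configuration, which this sketch does not cover.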
@grahamehorner,
I am closing this for lack of activity. I am also linking this PR, which introduces initial specs for a SQL data loader:
I would like to run/train an ML.NET model from within SQL Server as a SQLCLR assembly. At present ML.NET is failing with an obscure error that looks to be related to security. Is it, or will it be, possible to run/train ML.NET models from inside SQL stored procedures, close to the data source?