Skip to content

feat(ai) adding image tagging and translations #30287

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

wezell
Copy link
Contributor

@wezell wezell commented Oct 8, 2024

ref: #30284

This pull request introduces two new classes, AIVisionAPI and OpenAITranslationService, to enhance the AI capabilities in the dotCMS project. The AIVisionAPI class provides methods for tagging images and adding alt text, while the OpenAITranslationService class offers methods for translating content and handling translation parameters.

New AI Vision API:

  • dotCMS/src/main/java/com/dotcms/ai/api/AIVisionAPI.java: Added a new interface AIVisionAPI with methods for tagging images, adding alt text, and reading image tags and descriptions. ([dotCMS/src/main/java/com/dotcms/ai/api/AIVisionAPI.javaR1-R65](https://github.com/dotCMS/core/pull/30287/files#diff-57d9ce37bafde56eb885af1f975012f522647ef92554c2d97512da18bf322fb0R1-R65))

New AI Translation Service:

  • dotCMS/src/main/java/com/dotcms/ai/api/OpenAITranslationService.java: Added a new class OpenAITranslationService that extends AbstractTranslationService and provides methods for translating strings and content, setting service parameters, and parsing AI responses. ([dotCMS/src/main/java/com/dotcms/ai/api/OpenAITranslationService.javaR1-R328](https://github.com/dotCMS/core/pull/30287/files#diff-32212827a38e2210f004caac255ef701e833ac00077e894994a934c08c59b647R1-R328))

@wezell wezell linked an issue Oct 8, 2024 that may be closed by this pull request
Copy link

Quality Gate failed Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarQube

Catch issues before they fail your Quality Gate with our IDE extension SonarLint SonarLint

Copy link

github-actions bot commented Nov 8, 2024

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Nov 8, 2024
Copy link

This PR was closed because it has been stalled with no activity.

@github-actions github-actions bot closed this Nov 15, 2024
@wezell wezell reopened this Mar 20, 2025
@wezell wezell marked this pull request as ready for review March 20, 2025 02:29
import java.util.Optional;
import java.util.concurrent.TimeUnit;

public interface AIVisionAPI {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc


static final String AI_VISITON_TAG_AND_ALT_PROMPT_TEMPLATE = "AI_VISITON_TAG_AND_ALT_PROMPT_TEMPLATE";

static final Lazy<AIVisionAPI> instance = Lazy.of(com.dotcms.ai.api.OpenAIVisionAPIImpl::new);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think future forward we have to have such as a factory or so, to be able to select the vendor and model depending on user parameters, but it is ok by now

boolean tagImageIfNeeded(Contentlet contentlet);


boolean addAltTextIfNeeded(Contentlet contentlet);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc me

import java.util.stream.Collectors;
import org.apache.velocity.context.Context;

public class OpenAITranslationService extends AbstractTranslationService {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc me


public class OpenAITranslationService extends AbstractTranslationService {

static final String AI_TRANSLATION_SYSTEM_PROMPT = "AI_TRANSLATION_SYSTEM_PROMPT";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these seems to be a good candidate to be on super classes or interfaces

static final String AI_TRANSLATION_TEMPERATURE ="AI_TRANSLATION_TEMPERATURE";
static final String AI_TRANSLATION_RESPONSE_FORMAT ="AI_TRANSLATION_RESPONSE_FORMAT";

static int MAX_LANGUAGE_VARIABLE_CONTEXT = 1000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

such as a config value?

if (!translationKeysJSON.isEmpty()) {
systemContext.put("translationKeys", translationKeysJSON.toString());
}
String systemPrompt = Try.of(() -> VelocityUtil.eval(systemPromptTemplate, systemContext))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prompting will be more common in the future, a Service to encapsulate their generation wouldnt be bad

@Override
public Contentlet translateContent(Contentlet contentlet, Language targetLanguage, List<com.dotmarketing.portlets.structure.model.Field> oldFields, User user)
throws TranslationException {
Language sourceLang = APILocator.getLanguageAPI().getLanguage(contentlet.getLanguageId());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the method looks ok but it is a little bit longer, would be nice to make a refactor to split in into smaller methods

import java.util.stream.Collectors;
import org.apache.velocity.context.Context;

public class OpenAIVisionAPIImpl implements AIVisionAPI {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc me


static final ImageFilterExporter IMAGE_FILTER_EXPORTER = new ImageFilterExporter();

static final Cache<String, Tuple2<String, List<String>>> promptCache = Caffeine.newBuilder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the PromptService may handle cache or so


private Optional<String> getSha256(File imageFile) {
try {
var md = MessageDigest.getInstance("SHA-256");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encryptor.Hashing.sha256() does something similar to

return Optional.empty();
}

final Context ctx = VelocityContextFactory.getMockContext(contentlet, APILocator.systemUser());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not so fan of the Mock Context I would prefer that the factory realized if a request is tied in the current call and retrieve the full one, however I do not there are not too many fans of ThreadLocal approaches


@Override
public void init() {
APILocator.getLocalSystemEventsAPI().subscribe(LISTENER);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

import java.util.Optional;


public class OpenAIImageTaggingContentListener implements ContentletListener<Contentlet> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc


public class OpenAIImageTaggingContentListener implements ContentletListener<Contentlet> {

AIVisionAPI aiVisionAPI = AIVisionAPI.instance.get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be via CDI or at least APILocator

import java.util.Optional;
import java.util.Properties;

public class AIUtil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc me

private static final Properties properties;
static {
properties = new Properties();
try ( InputStream in = AIUtil.class.getResourceAsStream("/com/dotcms/ai/prompts/" + PROPERTY_FILE_NAME)){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be part of the PromptService / API

}

public static Map<String, Secret> getSecrets(String hostId) {
Host host = Try.of(() -> APILocator.getHostAPI().find(hostId, APILocator.systemUser(), true)).getOrNull();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think secrets that contains private information should be destroyed at least at the end of the request

import java.util.Map;
import java.util.Optional;

public class OpenAIVisionAutoTagActionlet extends WorkFlowActionlet {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc

public class OpenAIVisionAutoTagActionlet extends WorkFlowActionlet {

private static final long serialVersionUID = 1L;
AIVisionAPI aiVisionAPI = AIVisionAPI.instance.get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in the CDI or APILocator

@@ -0,0 +1,21 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to develop something to handle this thing in a standard way (dotCMS way) such as dotConnect or so, otherwise we will have N ways to do the same in a couple years

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bring AI tagging,Alt descriptions and translations into core
3 participants