이미지에서 프롬프트 추출하기 in Comfyui

이미지에서 프롬프트 추출하기 in Comfyui - QWEN3-VL

이미지 또는 사진에서 Prompt(프롬프트)를 추출하여 사용하기

네이버, 구글 또는 핀터레스트 등의 사이트에서 마음에 드는 사진을 보고 비슷한 이미지를 만들고 싶을때, 그 사진을 보고 직접 그것을 프롬프트로 만들기가 어려운 일이다. 만들었다고 해도 그것을 이용하여 이미지를 생성해 보면 원하는 이미지와는 동 떨어진 이미지가 만들어지기 쉽다.

그래서 보통은 나 를 이용하여 이미지를 업로드하고 프롬프트를 만들어 달라고 하여 추출한다. 나도 이런 방식으로 종종 프롬프트를 추출하여 사용하곤 한다.

여기에서는 외부 도움을 받아 추출하는 대신 내 컴퓨터에서 추출하여 사용하는 방법이다.

1. 이미지에서 프롬프트 추출 예

로 만든 이미지

위의 이미지에서 프롬프트를 추출 한다면......

chatGPT : A soft, elegant portrait of a young woman sitting on the floor, painting flowers with a brush and a palette. She wears a crisp white shirt with rolled-up sleeves and fitted blue jeans. Her long dark hair flows naturally as she focuses calmly on her artwork. A vase filled with colorful roses stands beside her. The scene has a warm beige background, smooth lighting, clean details, and a serene, gentle atmosphere. Ultra-high-definition, photorealistic illustration, soft shadows, pastel tones, refined skin texture, delicate rendering.
GEMINI : A beautiful, elegant East Asian woman sitting barefoot on the floor, wearing a loose white button-down shirt and form-fitting blue jeans. She is holding a fine paintbrush in her right hand, poised to touch a palette of watercolors held in her left hand. Next to her is a white ceramic vase with a bouquet of realistic, pastel-colored flowers (roses, perhaps some greenery). The setting is a minimalist studio with a solid, neutral beige or light tan background, emphasizing soft, natural studio lighting. Highly detailed, photorealistic, cinematic light, 8k.
Qwen3-VL : A serene, softly lit studio portrait of an elegant young woman with long dark hair sitting cross-legged on a warm beige floor, wearing a crisp white button-up shirt and slim-fitting blue jeans; she holds a paintbrush in one hand and a small palette with vibrant colors in her other — poised as if about to sketch or paint — beside her is a simple cream-colored vase filled with delicate flowers including roses and eucalyptus leaves; gentle natural light illuminates her focused expression and soft features against a minimalist background, evoking calm creativity and artistic grace. Style: Photorealistic digital illustration / Soft lighting / Minimalist composition. Mood: Calm, contemplative, creative focus. Tools/Style Note (for Z-Image): Use high-resolution rendering mode + realistic skin texture + subtle depth-of-field blur for backgrounds 이것보다 훨씬 더 상세하게도 추출하여 준다.

2. 필요한 모델과 노드들

필요한 모델은 QWEN3-VL- xxxxx 들인데 이것들을 다운받아 먼저 설치해도 되지만, 아래 노드들을 설치하고 실행하면 맨 처음에 필요한 모델이 자동적으로 다운 받아 설치된다.
필요한 노드 설치 : comfyui manager 에서 설치한다.
comfyui-QwenVL 노드
다른 비슷한 노드들도 있는데 내가 설치한 노드는 위의 노드들이며, 이것만 있으면 된다.

3. Workflow

Workflow

model_name : Qwen3-VL 중 자신의 컴퓨터 성능에 따라 선택
quantization : 4-bit, 8-bit, None(16) 에서 컴퓨터 성능에 따라 모델과 관련하여 선택
preset_prompt : 프롬프트를 어느 정도 상세히 분석할 것인가 를 선택하고 그 아래 입력란에 요청할 사항을 프롬프트로 입력한다.

예) You are a professional photographer. Analyze the photo in detail, including the subject, clothing, and pose. Pay particular attention to the clothing.
당신은 전문 사진작가입니다. 피사체, 의상, 포즈 등 사진을 자세히 분석하세요. 특히 의상에 더 주의를 기울이세요.

max_tokens : 분석하여 추출할 프롬프트의 최대 길이. 이 숫자가 클수록 상당히 긴 분석된 프롬프트를 만들어 준다.
기타 다른것은 디폴트값 그대로 사용하였다.

4. 프롬프트 추출 실행

이미지를 업로드하고
Image upload
Model : Qwen3-VL-8B-instruct
quantization : 8-bit(Balanced)
preset-prompt : Detailed Description ( You are a professional photographer. Analyze the photo in detail, including the subject, clothing, and pose. Pay particular attention to the clothing. )
max-tokens : 512
실행후 추출 프롬프트 : 상당히 길게 그리고 의상은 매우 상세히 서술하여 만들어 주었다.
소요시간 : 74초
추출된 프롬프트를 사용하여 Z-Image-Turbo 모델로 생성한 이미지
Z-Image-Turbo
Z-Image-Turbo 모델은 사용해 볼수록 생각보다 훨씬 더 쓸만한 이미지 생성형 AI 모델이다. 만든 이미지가 원본의 느낌이 비슷하게 많이 보인다.

5. 결론

추출된 프롬프트를 CLIP Text Encode (Prompt) 노드에 바로 연결 사용하면 Z-image-turbo 에서 쉽고 빠르게 비슷한 이미지가 만들어 진다.

처음 이미지에서 추출한 프롬프트로 Z-Image-Turbo 로 만든 이미지

이 글 맨 처음의 원본 이미지와 상당히 비슷하게 만들어 진다.

프롬프트를 만드는데 어려움을 느낀다면 이런 방법으로 마음에 드는 좋은 이미지에서 프롬프트를 추출하여 여러가지 모델로 이미지를 만들어 보면 프롬프트에 좀 더 빨리 익숙해 질 수 있다.

벌써 2025년의 마지막 달 4일입니다. 오늘은 유난히 춥고 눈도 제법 많이 오고 있네요.

이럴때는 몸 사리며 건강 조심해야 합니다.

이 블로그 검색