Home » SQL & PL/SQL » SQL & PL/SQL » Grouping text (11.2.0.3)
Grouping text [message #649456] Sat, 26 March 2016 04:24 Go to next message
Amine
Messages: 376
Registered: March 2010
Senior Member

Hi Gurus,
Thanks for the great job you're doing !

I have a table that contains text.
drop table my_text;
create table my_text 
(
	id_text		 int			,
	text		 varchar2(1000)
)
;

insert into my_text values (100, 'AA BB SS');
insert into my_text values (200, 'FF GG ');
insert into my_text values (300, 'ZZ XX ');


The goal is to group each text to a group of this sort :

drop table grouped_libs
create table grouped_libs
(
	id_group	int	,
	like_word	varchar2(1000)
)
;

insert into grouped_libs values (1, '%AA ');
insert into grouped_libs values (1, '%BB ');
insert into grouped_libs values (1, '%CC ');
insert into grouped_libs values (1, '%EE ');

insert into grouped_libs values (2, '%FF ');
insert into grouped_libs values (2, '%GG ');
insert into grouped_libs values (2, '%HH ');
insert into grouped_libs values (2, '%II ');


The desired result should be like this :

id_text			like_word		id_group
-------			---------		--------
100			AA			1
100			BB			1
200			FF			2
200			GG			2
300			null			null


Thanks in advance,

Amine
Re: Grouping text [message #649462 is a reply to message #649456] Sat, 26 March 2016 06:13 Go to previous messageGo to next message
Solomon Yakobson
Messages: 3275
Registered: January 2010
Location: Connecticut, USA
Senior Member
Something like:

select  id_text,
        like_word
  from  my_text,
        grouped_libs
  where instr(text,ltrim(like_word,'%')) > 0
/

   ID_TEXT LIKE_WORD
---------- ---------
       100 %AA
       100 %BB
       200 %FF
       200 %GG
       300

SQL> 


SY.
Re: Grouping text [message #649465 is a reply to message #649456] Sat, 26 March 2016 19:59 Go to previous message
Barbara Boehmer
Messages: 9097
Registered: November 2002
Location: California, USA
Senior Member
I don't know if what you provided is the actual case or a simplification of something larger and more complex. If this is all there is, then what Solomon provided is a good solution. If it is more complex, such as having more complex things to search for than just a couple of letters, then this looks like a case for Oracle Text document classification. Please see the example below using your simple test data, but without the % signs that are not needed.

SCOTT@orcl> -- tables and data you provided without % signs:
SCOTT@orcl> -- documents:
SCOTT@orcl> COLUMN text FORMAT A50 WORD_WRAPPED
SCOTT@orcl> SELECT * FROM my_text
  2  /

   ID_TEXT TEXT
---------- --------------------------------------------------
       100 AA BB SS
       200 FF GG
       300 ZZ XX

3 rows selected.

SCOTT@orcl> -- rules:
SCOTT@orcl> COLUMN like_word FORMAT A50 WORD_WRAPPED
SCOTT@orcl> SELECT * FROM grouped_libs
  2  /

  ID_GROUP LIKE_WORD
---------- --------------------------------------------------
         1 AA
         1 BB
         1 CC
         1 EE
         2 FF
         2 GG
         2 HH
         2 II

8 rows selected.

SCOTT@orcl> -- Oracle text ctxrule index:
SCOTT@orcl> CREATE INDEX test_idx ON grouped_libs (like_word) INDEXTYPE IS CTXSYS.CTXRULE
  2  /

Index created.

SCOTT@orcl> -- documentation classification based on rules:
SCOTT@orcl> CREATE TABLE classifications
  2    (id_text      NUMBER,
  3  	like_word    VARCHAR2(1000),
  4  	id_group     NUMBER)
  5  /

Table created.

SCOTT@orcl> BEGIN
  2    FOR doc IN
  3  	 (SELECT id_text, text FROM my_text)
  4    LOOP
  5  	 FOR rule IN
  6  	   (SELECT id_group, like_word FROM grouped_libs
  7  	    WHERE  MATCHES (like_word, doc.text) > 0 )
  8  	 LOOP
  9  	   INSERT INTO classifications (id_text, like_word, id_group)
 10  	     VALUES (doc.id_text, rule.like_word, rule.id_group);
 11  	 END LOOP;
 12    END LOOP;
 13  END;
 14  /

PL/SQL procedure successfully completed.

SCOTT@orcl> SELECT * FROM classifications
  2  /

   ID_TEXT LIKE_WORD                                            ID_GROUP
---------- -------------------------------------------------- ----------
       100 AA                                                          1
       100 BB                                                          1
       200 FF                                                          2
       200 GG                                                          2

4 rows selected.


Previous Topic: Column to Row
Next Topic: Can RESULT_CACHE be used in a Package Function?
Goto Forum:
  


Current Time: Sun Jun 30 16:05:15 CDT 2024