Tuesday, May 26, 2009

poorman's connect by -- hirerarchical queries

See http://www.evolt.org/article/Four_ways_to_work_with_hierarchical_data/17/4047/index.html

Four ways to work with hierarchical data

Rated 4.28 (Ratings: 11) (Add your rating)

Log in to add a comment
(16 comments so far)

Want more?

Picture of ghost_inside

Kirby Fling

Member info | Full bio

User since: December 05, 2000

Last login: December 05, 2000

Articles written: 1

Many thanks to RatFace for his article BBS style recursion How-To. I have also recently been wrestling with the concept of displaying heirarchical lists, specifically in a discussion forum tool that I've been writing.

I've spent quite a bit of time working out the data structure and display methods for the concept of heirarchical data and I've come to the conclusion that there are four main programming methods to approach the problem:

  1. Recursion
  2. Stack
  3. Flat Table
  4. Modified Preorder Tree Traversal Algorithm

So let's begin our discussion of each method:

Recursion

To many computer science types, recursion is frequently found to be the most "elegant" method to drill down through a hierarchical list of data.

And I would have to agree with that summation, but then again, that might be my first year of computer science speaking. ("Welcome to CS 101: Recursion until your head explodes")

The reason it is such an "elegant" solution is that you essentually have one function that does all of your data processing, display, and iteration to possible children. You call the function once (with the root of the tree) and everything is done. The added benefit is that you can call the function with any subnode of the tree and it will display the entire subtree without an additional logic or coding effort. In pseudo-code, a hierarchical list display function would look something like this:

function DisplayChildren(uid, indent_level)
SELECT id, (info)
FROM (db)
WHERE ParentID = uid

if recordcount > 0 then
(indent by indent_level) (HTML to display info for this child)
child_id = recordset.id
DisplayChildren(child_id, indent_level+1)
end if
end function

That's really all there is to the recursive method, which is why it's seen as such an elegant solution.

Disadvantages

The problem is that the vast majority of coding solutions we have at our disposal handle recursion very poorly. The idea is that every function (ideally) gets executed in its own memory space. And have to leave cleanup to whatever middleware we are using. So let's do a quick little check on efficiency with arbitrary numbers.

Let's take the following to be true (just for example):

  • Each time we get one record from the DB, it takes n seconds.
  • Each DB connection/disconnection takes n * 2 seconds.
  • Allocating memory for a new incidence of the function takes n/10 * iteration seconds (in ASP, and Cold Fusion, recursive algorithms have demonstrated a logorithmic decay in performance).

So, if we had 1000 messages in our discussion forum, the equation would look something like this: (1000 * n) + (1000 * n * 2) + (n/10 * iteration). Take a look at the following chart for better clarification:

Record    1:   3 * n seconds
Record 250: 29 * n seconds
Record 500: 53 * n seconds
Record 750: 79 * n seconds
Record 1000: 103 * n seconds
----------------------------
Total = approx 50,000 * n seconds

As you can see, that logorithmic decay really ends up biting in the end. And granted, the numbers I used are just for the sake of example, but you can see that there is a fundamental issue of excess over head with the recursion model. It is a very elegant model, it fits in your head very well, and in the code even better, but unless the system you are using is designed for recursion (e.g. scheme, lisp, etc), you will always run into excessive overhead when the numbers start getting huge.

Using a Stack

The stack is usually the second method of approaching large, crazy lists. There are 2 types of stack programming: First In First Out and First In Last Out. When working with heirachcical lists, you have to use the FILO model. The idea is that you create a stack somewhere, which is essentially just a list. Then add the id of the record you are currently working with, and when you are done with that record, you pull it off the list. If that record has any children, then you start working with the first child and put it's id at the end of the list. etc.

Much like the recursion model, the data looks something like this:

Field:Type:Null:Comments:
UIDint, numeric, etc.nothe unique identifier for each record
name, title, etcvarchar(x)yesthe name or title of this record
ParentIDint, numeric, etc.yesthe UID of the parent to this object, NULL for the top level of the tree

But now the question is whether you want to do everything in the middlewareor closer to the data in an SQL stored procedure.

Advantages to running the stack model in your middleware:

  1. you can display your HTML inside each iteration.
  2. easier to code/read/modify?

Disadvantages:

  1. just as many DB connections being created as in the recursive model

Advantages of running the stack with a stored procedure:

  1. executes closer to the data, so it's faster
  2. gives you a compiled temp table that needs little effort in the middleware
  3. only one connection to the database

Disadvantages:

  1. you might not be able to run stored procedures on your DB system
  2. you might not have access to create stored procedured on your DB system
  3. more difficult to code/read/manage it it's remote and in a different language

Here is the code for a hierarchical stack algorithm that is outlined in the Microsost SQL 6.5 documentation as an example:

CREATE PROC expand (@current char(20)) AS
SET nocount on
DECLARE @level int, @line char(20)
CREATE TABLE #stack (item char(20), level int)
INSERT INTO #stack VALUES (@current, 1)
SELECT @level = 1
WHILE @level > 0
BEGIN
if EXISTS (SELECT * FROM #stack WHERE level = @level)
BEGIN
SELECT @current = item
FROM #stack
WHERE level = @level
SELECT @line = space(@level - 1) @current
PRINT @line
DELETE FROM #stack
WHERE level = @level
AND item = @current
INSERT #stack
SELECT child, @level 1
FROM hierarchy
WHERE parent = @current
if @@rowcount > 0
SELECT @level = @level 1
END
else
SELECT @level = @level - 1
END

You can modify the concept in this code to fit your particular middleware.

Now let's take a look at the theoretical efficiency numbers like we did in therecursive model:

Let's take the following to be true (just for example):

  • Each time we get one record from the DB, it takes n seconds.
  • Each DB connection/disconnection (from middleware) takes n * 2 seconds.
  • Time to execute the list/stack operations is almost nonexistant
  • Time to create the temporary table in the stored procedure takes n / 2 seconds

So, we have the same 1000 messages in our discussion forum, the equation would look something like this:

Running the Stack with middleware: (1000 * n) + (1000 * n * 2) = 3000 * n

Running the Stack with a stored procedure: (1000 * n) + (2 * n) + (1000 * (n / 2)) = 1502 * n

Flat Table Model

While I was building my discussion forum, I immediately went for the recursive model to get my data and display it on the screen. I successfully built my forum in both Cold Fusion and ASP. It worked in both, but I wasn't very happy with the performance of either. I decided to investigate other ways of turning a sequence of parent-child relationships into a hierarchical list of data. I started playing with the stack model, and eventually discovered the stored procedure method, and while I was working with that, I had this little flash of insight:

A flat table will always yield the fastest query.

So I started asking myself "How do I turn the data I've got (or am about to get) into a flat table?" As it turns out, it's really quite easy. Take the example of a discussion forum. What information is really on the page? If the information looks like this:

"hello world"johnDoe12/2/00
"Re: hello world"janeDoe12/4/00
"How is your foobar?"johnDoe12/3/00
"my foobar is fine"sysadmin12/2/00
"hello yourself"jackDoe12/2/00
"my app RuLeZ"idiotboy12/1/00
"learn how to spell" jacksnot12/2/00

Everybody sees that there is a subject, author and date. But the indenting on the left is usually seen as the parent-child relationship. "my foobar is fine" is a child of "How is your foobar?" etc. But you can also look at it apart from the parent-child relationship and see it as:

  1. the order in which to be displayed
  2. the level to be indented

So, we can contruct the data schema to look something like this:

Field:Type:Null:Comments:
UIDint, numeric, etc.nothe unique identifier for each record
name, title, etcvarchar(x)yesthe name or title of this record
ParentIDint, numeric, etc.yesthe UID of the parent to this object, NULL for the top level of the tree
rank (or display_order)int, numeric, etc.yesthe order to display all records
indent_levelint, numeric, etc.yeshow much to indent this item

Now, how do we work with this data structure once it's built? The selection of the data is easy:

function DisplayChildren()
SELECT id, (info), indent_level
FROM (db)
ORDER BY rank

while the recordset isn't empty {
(indent by indent_level) (HTML to display info for this child)
}
end function

But, as easy as that is, the difficulty is now shifted to the point of insertion. The indent level is easy, it is always (parent.indent_level + 1). The display_order (or rank) is a little more challenging. The rank of the new child needs to be greater than it's parent, but less than any of it's children or the next sibling at the same level. In the example above a new child of "hello world" should have a rank of 2. But if they were all ranked from 1 to 7 to begin with, that means that you need to increment the rank of all messages whose rank is greater than 1. Which just happens to be the parent of the child we are inserting. So you end up calling a stored procedure that looks something like this:

CREATE PROC increment_rank (@start_rank int) AS
BEGIN
UPDATE (table_name)
SET rank = rank + 1
WHERE rank > @start_rank
END

Note: you can do this in middleware if you need, but it's such a simply query that will gain such benefit from a pre-compiled query path that a stored procedure is the best way to go.

This solution ends up with the biggest database hit at the point of insertion, but even that hit isn't terribly bad, and the lessening of the hit for each view of the folder is VERY significant. Let's go back to our efficiency numbers for a moment:

Let's take the following to be true (just for example):

  • Each time we get one record from the DB, it takes n seconds.
  • Each DB connection/disconnection (from middleware) takes n * 2 seconds.

In order to view all of of our 1000 example messages, the equation looks like this: (1000 * n) + (2 * n) = 1002 * n

But in order to be fair, we also have to look at our numbers for the insertion of a given message...

Let's take the following to be true (just for example):

  • Each time a record is inserted, it takes n * 2 seconds.
  • Each time a record is updated, it takes n * 2 seconds.

So, for both the recursive and stack models, the cost of insertion is simply: (connection + insertion) n * 2 + n * 2 = n * 4

But for the flat table model, the cost of insertion is something akin to: (connection + updating 1-1000 records + insertion) n * 2 + (n * 2 * x) + n * 2 = [6n to 1004n]

As you can see, the cost of insertion is MUCH higher for the flat table model, but even the highest cost of insertion is not as high as the fastest viewing method that we've discussed thus far (the stored procedure/stack model @ 1502 * n). And the viewing speed is 30% faster. So if you expand the numbers to include some kind of usage metric, with an estimated 8,000 page views a month, and an estimated 300 messaged inserted a month. The flat table model is the hands down winner..... for a discussion forum....

Disadvantages

Which brings us to our disadvantages to the flat table model. It isn't easily ported to a lot of other uses. With a discussion forum, you almost always want it sorted hierarchically, then by date descending. And the rank-increment method works perfectly for this application. If you wanted to sort it by date ascending, it is still relatively easy (insert the child before the next message with the same parent id). But for other large lists of information that needs to be displayed hierarchically, it's very difficult to get them to display properly without some kind of re-ordering interface that you can expose to whoever is in charge of the information. In my experience, that is almost always the case anyway so the flat table model is once again a viable option for your hierarchical list pages.

A Modified Preorder Tree Traversal Algorithm

To be completely honest, I've never used this method for working with hierarchical data, and I don't even understand it that well. I read about in a book called SQL for Smarties where Joe Celko devotes 2 entire chapters to working with hierarchical data in databases that aren't really designed to. I figured I should at least make a reference to this method since it purportedly has so many great advantages to the traditional parent-child relationship model.

The basic idea is that you start at the right-most branch of the root of your tree and give it a 1 where the relationship starts and a 2 where it ends. Then you take that node's right-most relationship and give it a 3 where it starts and a 4 where it ends. So that node ends up with 2 numbers: left and right. Decrement the "left" and you get the "right" of it's parent, increment the "right" and you get either it's child or it's sibling at the same level. Continue that idea all the way around the tree until you are back at the root again. Each object ends up with 2 numbers and from those number you can run some fun queries and come up with lots of information about that particular node, the tree in general, paths between different nodes and lots of other stuff. But as I said, I don't fully understand it yet. Maybe when I do, I'll write another article.

Monday, April 27, 2009

Oracle HTTP Server: EMDAEMON HTTP/1.1 404 , Not Found, when trying to open it with port 80

see http://forums.oracle.com/forums/thread.jspa?threadID=374391

Stopped Oracle DBConsole services in Windows Services.

Went to:
{drive}:{Oracle_home}\sysman\config\emd.properties

Changed:

# EMD main servlet URL
#
EMD_URL=http://{Your dns name}:/emd/main/

Note the "colon" with missing port number, so this may default to port 80

to:

# EMD main servlet URL
#
EMD_URL=http://{Your dns name}:1830/emd/main/

You may have to use another port besides "1830"
if that is being used by your system.

Rebooted system.
Simply restarting DBConsole services was not enough.

Found that EMAGENT.EXE was no longer listening on IP 0.0.0.0 and Port 80.

Changed Apache to Port 80,
Stopped and restarted Apache.

HTMLDB works on Port 80
and
Oracle Database Control works on Port 1158 (default).

Tuesday, April 21, 2009

Oracle HTTP Server: PERMISSION DENIED: MAKE_SOCK: COULD NOT BIND TO PORT 80

See http://download-east.oracle.com/docs/cd/B15904_01/web.1012/b14008.pdf, page D-3

You have to run the following:

1. Log in as root.
2. Run the following commands in the middle-tier Oracle home:
cd ORACLE_HOME/Apache/Apache/bin
chown root .apachectl
chmod 6750 .apachectl

Thursday, March 12, 2009

Tomcat Error: ...is quoted with " which must be escaped when used within the value

From Apache Tomcat 5.5.27 onwards, it seems that Tomcat is picky about double quoted expressions:

ERROR [http-80-Processor22] - Servlet.service() for servlet jsp threw exception
org.apache.jasper.JasperException: /forum/index.jsp(73,36) Attribute value request.getRequestURI()+"?"+request.getQueryString() is quoted with " which must be escaped when used within the value
at org.apache.jasper.compiler.DefaultErrorHandler.jspError(DefaultErrorHandler.java:40)
at org.apache.jasper.compiler.ErrorDispatcher.dispatch(ErrorDispatcher.java:407)
at org.apache.jasper.compiler.ErrorDispatcher.jspError(ErrorDispatcher.java:198)
at org.apache.jasper.compiler.Parser.parseQuoted(Parser.java:307)
at org.apache.jasper.compiler.Parser.parseAttributeValue(Parser.java:250)
at org.apache.jasper.compiler.Parser.parseAttribute(Parser.java:211)
at org.apache.jasper.compiler.Parser.parseAttributes(Parser.java:149)
at org.apache.jasper.compiler.Parser.parseParam(Parser.java:875)
at org.apache.jasper.compiler.Parser.parseBody(Parser.java:1829)
at org.apache.jasper.compiler.Parser.parseOptionalBody(Parser.java:1081)
at org.apache.jasper.compiler.Parser.parseInclude(Parser.java:909)
at org.apache.jasper.compiler.Parser.parseStandardAction(Parser.java:1232)
at org.apache.jasper.compiler.Parser.parseElements(Parser.java:1588)
at org.apache.jasper.compiler.Parser.parse(Parser.java:132)
at org.apache.jasper.compiler.ParserController.doParse(ParserController.java:212)
at org.apache.jasper.compiler.ParserController.parse(ParserController.java:101)
at org.apache.jasper.compiler.Compiler.generateJava(Compiler.java:156)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:296)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:277)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:265)
at org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:564)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:302)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:329)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
at com.cj.trim.trimFilter.doFilter(Unknown Source)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
at gr.knowledge.util.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:121)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
at java.lang.Thread.run(Unknown Source)




See the relevant https://issues.apache.org/bugzilla/show_bug.cgi?id=45015

It seems that, you can either edit your jsp code to be more compliant or get rid of the error by editing catalina.properties and putting this extra line:

org.apache.jasper.compiler.Parser.STRICT_QUOTE_ESCAPING=false
This actually tells Tomcat to be more relaxed about JSP compliance regarding double quoted expressions.

See also http://tomcat.apache.org/tomcat-5.5-doc/config/systemprops.html

Monday, March 9, 2009

javax.mail.MessagingException: 501 Syntax: HELO hostname

If your java mailer returns

javax.mail.MessagingException: 501 Syntax: HELO hostname

upon sending an email, most probably you have to set a hostname to the machine running java (not the SMTP server per se)

Run "hostname javaservername " and / or edit /etc/sysconfig/network to note the server name.

Tuesday, March 3, 2009

context XML file deleted when path is not readable

If your Tomcat application has a docbase which is a network place (being it a Windows share or a NFS Linux share) and temporarily the connection between the two machines is lost, chances are that the context.xml of your app as it is deployed in \conf\Catalina\localhost will be automatically removed.


In order to avoid this, edit your server.xml file and set autoDeploy="false" in the Host tag.

See https://issues.apache.org/bugzilla/show_bug.cgi?id=40050

slow login in OpenSSH

if you ever systematically experience slow initial connection (login) in an OpenSSH server using eg WinSCP, try edit /etc/ssh/sshd_config disabling reverse DNS lookups:

UseDNS no

In my case (CentOS 5.2 server, WinSCP/Putty client), this solved the problem, and the connections run REALLY fast now

Monday, February 23, 2009

EXP-00056: ORACLE error 932 encountered ORA-00932: inconsistent datatypes: expected BLOB, CLOB got CHAR

see http://www.oratransplant.nl/category/oracle/database/bugs-and-issues/:





We’re in the middle of converting our 8.1.7.4 database to 10g release 2 (10.2). With this migration we’re also changing the character set of the database to AL32UTF8 and changing the database to char semantics. This gave us serious problems trying to export the database.

First a short introduction on the choice of character set and length semantics: Using Unicode as a character set will allow us to store any character and not just the western European characters we’re currently limited to. Unicode is a multi-byte character set which means that a single character could use more than a single byte. So what do you mean by VARCHAR2(10)? Is this 10 characters or 10 bytes? This used to be the same with our single byte character set, but these days are over. You can specify VARCHAR2(10 BYTE) or VARCHAR2(10 CHAR) if you want to be specific. However, all of our existing code doesn’t use this syntax. This is where the NLS_LENGTH_SEMANTICS parameter comes in. You can set this to BYTE or CHAR basically defining which one of the two is the default when not specifying it explicitly. A default installation will use BYTE semantics, but we decided to change it to CHAR. To me that is the more logical option.

So, we created a fresh database with NLS_LENGTH_SEMANTICS=CHAR. If you then run an export:

C:\>set ORACLE_SID=dev01


C:\>set NLS_LANG=AMERICAN_AMERICA.AL32UTF8

C:\>exp '/ as sysdba' full=y file=full.dmp

Export: Release 10.2.0.1.0 - Production on Tue Nov 22 09:48:06

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to: Personal Oracle Database 10g Release 10.2.0.1.0
With the Partitioning, OLAP and Data Mining options
Export done in AL32UTF8 character set and AL16UTF16 NCHAR
character set

About to export the entire database ...
. exporting tablespace definitions
. exporting profiles
. exporting user definitions
. exporting roles
. exporting resource costs
. exporting rollback segment definitions
. exporting database links
. exporting sequence numbers
. exporting directory aliases
. exporting context namespaces
. exporting foreign function library names
. exporting PUBLIC type synonyms
. exporting private type synonyms
. exporting object type definitions
. exporting system procedural objects and actions
. exporting pre-schema procedural objects and actions
. exporting cluster definitions
EXP-00056: ORACLE error 932 encountered
ORA-00932: inconsistent datatypes: expected BLOB, CLOB got CHAR
EXP-00056: ORACLE error 932 encountered
ORA-00932: inconsistent datatypes: expected BLOB, CLOB got CHAR
EXP-00000: Export terminated unsuccessfully

Searching MetaLink made me find Note 339938.1 which describes a situation where a 10.2 full export fails with an ORA-932 when exporting the cluster definitions. I figured I found the issue and a solution. But the note describes that some SYS objects should be invalid whereas all my SYS objects are valid. It also describes that a SELECT on sys.ku$_xmlschema_view should fail. This one does fail on me:

SQL> select url, local, stripped_val from sys.ku$_xmlschema_view;

select url, local, stripped_val from sys.ku$_xmlschema_view
*
ERROR at line 1:
ORA-00932: inconsistent datatypes: expected BLOB, CLOB got CHAR

Now it was time to open a Service Request (the new name for a TAR) on MetaLink. In the end the support engineer came with the solution. It all boiled down to two pieces of advice:

  • only switch to char semantics after database creation (that has no impact on the “char semantics support”, any SYS objects cannot be CHAR and that is what happened here)
  • use a “startup migrate” when running any patch or catalog scripts

So I shouldn’t have set NLS_LENGTH_SEMANTICS=CHAR before creating the database. The second advice to always use STARTUP MIGRATE is also a good tip. This will force the system of doing anything in byte semantics.

Here are the steps to “repair” my database. First check that the NLS_LENGTH_SEMANTICS is indeed set to CHAR. Then shutdown the database and restart as “startup migrate” You can see that the length semantics are now BYTE:

C:\>sqlplus / as sysdba


SQL*Plus: Release 10.2.0.1.0 - Production on Tue Nov 22 10:00:46

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to:
Personal Oracle Database 10g Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> column parameter format a20
SQL> column value format a5
SQL> select * from nls_session_parameters
2 where PARAMETER='NLS_LENGTH_SEMANTICS';

PARAMETER VALUE
-------------------- -----
NLS_LENGTH_SEMANTICS CHAR

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup migrate;
ORACLE instance started.

Total System Global Area 469762048 bytes
Fixed Size 1249608 bytes
Variable Size 146804408 bytes
Database Buffers 314572800 bytes
Redo Buffers 7135232 bytes
Database mounted.
Database opened.
SQL> select * from nls_session_parameters
2 where PARAMETER='NLS_LENGTH_SEMANTICS';

PARAMETER VALUE
-------------------- -----
NLS_LENGTH_SEMANTICS BYTE

Now drop the XML DB stuff. After that exit SQL*Plus. There is some bug that screws up your session. This means you’ll have to close SQL*Plus at the end:

SQL> @?\RDBMS\ADMIN\catnoqm.sql


PL/SQL procedure successfully completed.

User dropped.

PL/SQL procedure successfully completed.

Table dropped.

SQL> exit
ERROR:
ORA-00600: internal error code, arguments: [729], [256],
[space leak], [], [], [], [], []

Disconnected from Personal Oracle Database 10g Release 10.2.0.1.0
With the Partitioning, OLAP and Data Mining options
(with complications)

There is a bug that says you have to restart the database after running catnoqm. Be sure to start it with “startup migrate” again:

C:\>sqlplus / as sysdba


SQL*Plus: Release 10.2.0.1.0 - Production on Tue Nov 22 10:05:17

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to:
Personal Oracle Database 10g Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup migrate;
ORACLE instance started.

Total System Global Area 469762048 bytes
Fixed Size 1249608 bytes
Variable Size 146804408 bytes
Database Buffers 314572800 bytes
Redo Buffers 7135232 bytes
Database mounted.
Database opened.
SQL> column parameter format a20
SQL> column value format a5
SQL> select * from nls_session_parameters
2 where PARAMETER='NLS_LENGTH_SEMANTICS';

PARAMETER VALUE
-------------------- -----
NLS_LENGTH_SEMANTICS BYTE

Now run the CATQM.SQL script which requires three arguments (initial password and two tablespaces):

SQL> @?\RDBMS\ADMIN\catqm.sql change_on_install SYSAUX TEMP


Table created.

Table created.

[[ lot of succesfull statements removed for readability ]]

Package dropped.

Package dropped.

Function dropped.

Commit complete.

Now shutdown and start the database normally. You can check that the length semantics are back to its original (CHAR) and the query from sys.ku$_xmlschema_view now works fine:

SQL> shutdown immediate;

Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 469762048 bytes
Fixed Size 1249608 bytes
Variable Size 150998712 bytes
Database Buffers 310378496 bytes
Redo Buffers 7135232 bytes
Database mounted.
Database opened.
SQL> column parameter format a20
SQL> column value format a5
SQL> select * from nls_session_parameters
2 where PARAMETER='NLS_LENGTH_SEMANTICS';

PARAMETER VALUE
-------------------- -----
NLS_LENGTH_SEMANTICS CHAR

SQL> select url, local, stripped_val
2 from sys.ku$_xmlschema_view;

URL
----------------------------------------------------------------
LOCAL
----------
STRIPPED_VAL
----------------------------------------------------------------
http://xmlns.oracle.com/xdb/XDBSchema.xsd
0
MetaLink Note 144808.1 for more information and limits of using BYTE and CHAR semantics. This case will be added to this Note.

Update 17-dec-2005: Oracle filed bug 4886376 to have this problem fixed in a future release. The bug description also indicates that AL32UTF8 will be the default character set for a version 11.x database.