CUBRID PL/CSQL — Oracle-Compatible Procedural SQL Compiled to Java in the PL Family Runtime
Contents:
- Relationship to the PL Family
- Theoretical Background
- Common DBMS Design
- CUBRID’s Approach
- Source Walkthrough
- Cross-check Notes
- Open Questions
- Sources
Relationship to the PL Family
Section titled “Relationship to the PL Family”This document covers PL/CSQL — CUBRID’s Oracle-dialect procedural
SQL. Its sibling document cubrid-pl-javasp.md covers JavaSP, the
Java-stored-procedure sub-system. The two form the PL family
(pl-family tag) because they share almost everything except the
front-end that turns user input into a callable Java class.
What the two forms share — the entire src/sp/ C glue layer and
the pl_engine/pl_server JVM are common to both:
- Catalog rows. Both kinds are stored in the same system catalog
classes —
_db_stored_procedure,_db_stored_procedure_args, and_db_stored_procedure_code. The structures insp_catalog.hpp(sp_info,sp_arg_info,sp_code_info) and the DML functions insp_catalog.cpp(sp_add_stored_procedure,sp_add_stored_procedure_argument,sp_add_stored_procedure_code) are used by both without language differentiation at the catalog level. - Transport.
pl_connection.cpp/pl_comm.cown the Unix-domain-socket (or TCP) connection pool between the C server and the JVM. Both PL/CSQL and JavaSP calls travel this socket; theconnection_poolandconnection_viewtypes manage pooled connections;pl_server_init/pl_server_destroy/pl_server_wait_for_ready(inpl_sr.h) start and stop the shared JVM process. - Executor and session.
pl_executor.cppconstructs theinvoke_javamessage that crosses the wire.pl_session.cppholds the per-CUBRID-session execution stack (session::create_and_push_stack). The sameExecuteThread.javainside the JVM handles both kinds of execution requests. - JVM host.
Server.java/ListenerThread.java/ExecuteThread.javaare the JVM-side runtime; they are not PL/CSQL-specific. - Wire protocol. Both ride the same
RequestCode-tagged binary protocol (Header,CUBRIDPacker/CUBRIDUnpacker). A PL/CSQL program that has been compiled to a Java class is indistinguishable from a hand-written JavaSP at execution time.
What is PL/CSQL-specific — not shared with JavaSP:
- ANTLR grammars.
PlcLexer.g4,PlcParser.g4, andStaticSqlWithRecords.g4tokenise and parse PL/CSQL syntax. JavaSP source is never parsed by CUBRID; it is provided pre-compiled as.classor.jar. - AST. The
compiler/ast/tree ofDecl*,Stmt*,Expr*, andUnitclasses is the PL/CSQL intermediate representation. JavaSP has no such representation. - Semantic analysis.
TypeChecker(visitor over the AST) resolves names, checks types, and emitsSemanticError. It calls back to the C server viaServerAPI/SqlSemanticsto type-check embedded static SQL at compile time — a step that has no analogue in JavaSP. - Type system. The
compiler/type/hierarchy (Type,TypeChar,TypeVarchar,TypeNumeric,TypeRecord,TypeVariadic) models PL/CSQL’s static type universe.CoercionandCoercionSchemeencode the implicit conversion rules. - Code generation.
JavaCodeWriterwalks the checked AST and emits a Java source string.MemoryJavaCompiler(wrappingjavax.tools.JavaCompiler) compiles it in-process.pl_compile_handler.cppcoordinates this entire compile-on-DDL path from the C side.
The architectural insight: a PL/CSQL CREATE PROCEDURE causes the
CUBRID server to drive a compile round-trip to the JVM (via
compile_handler::compile), have the JVM translate the source to Java
and compile it with javac in memory, return the resulting bytecode, and
store it in the catalog. From that point on, calling the procedure is
identical to calling a JavaSP — which is exactly why these two docs
share infrastructure and live as one PL family.
Theoretical Background
Section titled “Theoretical Background”A procedural SQL language is the textbook answer to a recurring need: relational SQL is set-oriented and declarative, but real applications also need control flow, local variables, exception handling, and user-defined abstractions that live inside the database. The SQL/PSM specification (ISO/IEC 9075-4) standardises the shape of such a language; every major engine ships its own dialect that implements roughly the SQL/PSM core plus vendor-specific sugar. Database System Concepts (Silberschatz/Korth/Sudarshan, ch. 5 “Advanced SQL”) gives the neutral framing; vendor manuals — Oracle’s PL/SQL Language Reference, Postgres’s PL/pgSQL — SQL Procedural Language, Microsoft’s T-SQL reference — define what users actually write.
Three threads of theory feed every implementation and frame the rest of this document:
-
The PL/SQL block model. Every Oracle-compatible procedural SQL has the same syntactic skeleton:
DECLARE …(variables, cursors, exceptions),BEGIN …(executable statements),EXCEPTION WHEN …(handlers),END;. Blocks nest, scoping is lexical, and exception propagation walks the dynamic stack of blocks. Anonymous blocks are first-class; named procedures and functions are blocks with a header. -
AST interpretation versus target-language emission. A procedural-SQL implementation has two reasonable execution strategies. The interpreter keeps the AST in memory and walks it at run time (Postgres
plpgsqldoes this — every stored function is an AST tree underPLpgSQL_function, evaluated byexec_stmtand friends insrc/pl/plpgsql/src/pl_exec.c). The emitter translates the AST into a host language and lets the host’s normal compile/JIT path execute it (Oracle compiles PL/SQL into MCODE; DB2 SQL PL compiles into native C; CUBRID emits Java source and feeds it tojavac). Emission trades compile-time cost for run-time speed, and inherits the host’s debugger and profiler. CUBRID chose emission because it already runs an external JVM (pl_server) for JavaSP, so re-usingjavax.tools.JavaCompilerwas free. -
LL(*) parser generators (ANTLR). ANTLR 4 (Parr, The Definitive ANTLR 4 Reference, 2nd ed.) generates an adaptive LL(*) parser from a
.g4grammar; the result is a parse tree whose nodes areorg.antlr.v4.runtime.ParserRuleContextinstances. Walking the parse tree is decoupled from parsing: the generatedXxxBaseVisitor<R>lets a hand-written visitor produce anything (an AST in another shape, type information, code). ANTLR is the dominant grammar tool in the JVM ecosystem; Presto, Hive, JSqlParser, and CUBRID’s PL/CSQL all use it.
After these threads are named, every CUBRID-specific structure in this document either implements one of them or makes the generated artifact persistable.
Common DBMS Design
Section titled “Common DBMS Design”Every engine that ships a procedural SQL adopts roughly the same set of patterns. The difference is the back-end of the compiler.
Same outer shape: lex → parse → semantic check → emit
Section titled “Same outer shape: lex → parse → semantic check → emit”Whether the back-end is interpretation or emission, the front
half is identical: text → token stream → parse tree → semantic
analysis (name resolution + type checking) → IR. The IR can be
the parse tree itself (if the back-end is an interpreter) or a
host-language source string (if the back-end is javac /
gcc / native code).
Catalogged by name; the source is stored
Section titled “Catalogged by name; the source is stored”In every engine the procedure name lives in a system catalog
(pg_proc, dba_procedures, CUBRID’s _db_stored_procedure),
the original source text is stored on disk so users can read it
back, and the compiled artifact (bytecode / MCODE / Java JAR)
is stored either inline or in a side table.
Embedded SQL is special
Section titled “Embedded SQL is special”A procedural language hosts SQL inside its statements. Two
problems arise. First, the SQL grammar is large; embedding it
as-is bloats the procedural grammar. Engines either (a) call
out to the SQL parser at compile time
(SELECT INTO …; is reparsed by the host’s main SQL parser —
Postgres does this) or (b) use a “skeleton” embedded grammar
that captures only enough structure to find the boundary of
the SQL fragment, then ship the raw text to the server for
parsing. Second, host variables must be substituted: the
compiler rewrites INTO v into bind parameters and binds
the locals at execution time.
Cursor lifecycle is a state machine
Section titled “Cursor lifecycle is a state machine”OPEN cur allocates server-side resources, FETCH advances,
CLOSE releases. Implicit attributes (%FOUND, %NOTFOUND,
%ROWCOUNT, %ISOPEN) and ROWTYPE record types are
syntactic sugar over this state machine.
Exceptions propagate dynamically
Section titled “Exceptions propagate dynamically”Named exceptions (NO_DATA_FOUND, TOO_MANY_ROWS,
user-declared) plus RAISE and RAISE_APPLICATION_ERROR
have dynamic scope: an exception unwinds through enclosing
blocks until a WHEN handler matches or the call stack is
exhausted. The natural mapping in a Java back-end is
throw … catch …; in an AST interpreter it is a setjmp/longjmp
or a Java-level exception.
Where CUBRID sits on this map
Section titled “Where CUBRID sits on this map”| Engine | Front | IR | Back-end | Storage |
|---|---|---|---|---|
| Oracle | hand | PCODE | MCODE / native dispatcher | dba_source |
| Postgres | bison | AST tree | tree-walk interpreter (pl_exec) | pg_proc |
| DB2 SQL PL | bison | C source | external C compiler | catalog |
| CUBRID | ANTLR | AST tree | emit Java → in-process javac | _db_stored_procedure_code |
CUBRID is closest to DB2 in spirit (compile to a host language
that already has a compiler), but the host is Java rather than
C, the compiler is in-process (ToolProvider.getSystemJavaCompiler()),
and the engine for the resulting class is the same JVM that
hosts JavaSP.
CUBRID’s Approach
Section titled “CUBRID’s Approach”PL/CSQL lives entirely in pl_engine/pl_server/. Compilation is
triggered from C-side cub_server when DDL fires
CREATE PROCEDURE … LANGUAGE PLCSQL; the source is shipped over
a Unix domain socket to the running pl_server JVM, parsed,
type-checked, lowered to Java source, compiled by in-process
javac, packaged as a JAR (Base64-encoded) and shipped back. The
catalog row is _db_stored_procedure (name, signature, language)
and the JAR blob is stored in _db_stored_procedure_code next to
the textual source.
Grammar — PlcLexer.g4 and PlcParser.g4
Section titled “Grammar — PlcLexer.g4 and PlcParser.g4”The grammars are forked from the
antlr/grammars-v4/sql/plsql Oracle-PL/SQL grammar and trimmed
to what CUBRID supports. The lexer file
(PlcLexer.g4) is 356 lines; the parser
(PlcParser.g4) is 687 lines. Both live under
pl_engine/pl_server/src/main/antlr/ and are fed to ANTLR by
the Gradle generateGrammarSource task; the generated classes
land in com.cubrid.plcsql.compiler.antlrgen and are checked in
by Gradle’s build cache, not the repo.
The most distinctive lexer trick is mode-switching between
PL/CSQL keywords and embedded static SQL. The lexer starts in
DEFAULT_MODE; encountering SELECT, INSERT, UPDATE,
DELETE, WITH, MERGE, REPLACE, or TRUNCATE switches
into STATIC_SQL mode, in which the rest of the SQL fragment
is consumed as opaque tokens until a balancing ; (or ))
restores DEFAULT_MODE. This is the only practical way to
embed CUBRID’s full SQL grammar without duplicating it inside
the procedural grammar.
// keywords that starts Static SQL — PlcLexer.g4WITH: W I T H { staticSqlParenMatch++; mode(STATIC_SQL); };SELECT: S E L E C T { staticSqlParenMatch++; mode(STATIC_SQL); };INSERT: I N S E R T { staticSqlParenMatch++; checkFirstLParen = true; mode(STATIC_SQL); };UPDATE: U P D A T E { staticSqlParenMatch++; mode(STATIC_SQL); };DELETE: D E L E T E { staticSqlParenMatch++; mode(STATIC_SQL); };// ...mode STATIC_SQL;SS_SEMICOLON : ';' { setType(PlcParser.SEMICOLON); staticSqlParenMatch = -1; checkFirstLParen = false; mode(DEFAULT_MODE); };SS_NON_STR : ~( ';' | '\'' | ' ' | '\t' | '\r' | '\n' | '(' | ')' | '?' | '"' | '[' | '`' )+ ;SS_BIND_PARAM : '?' keeps placeholder support inside the
opaque region; the SQL fragment is replayed verbatim against
the C-side parser at compile time (see “Semantic analysis”
below). The mode trick also handles the ambiguity that some
keywords (INSERT, REPLACE, TRUNCATE) are both statement
heads and built-in functions — checkFirstLParen watches the
first character after the keyword and bounces back to
DEFAULT_MODE if it sees (, treating the keyword as a
function call instead.
The parser carries the full PL/SQL block model. The top-level non-terminals tell the story:
// PlcParser.g4 — the routine and block grammar (condensed)sql_script : create_routine EOF ;
create_routine : CREATE (OR_REPLACE)? routine_definition (COMMENT CHAR_STRING)? ;
routine_definition : (PROCEDURE | FUNCTION) routine_uniq_name ( (LPAREN parameter_list RPAREN)? | LPAREN RPAREN ) (RETURN type_spec)? (authid_spec? deterministic_spec? | deterministic_spec authid_spec) (IS | AS) (LANGUAGE PLCSQL)? seq_of_declare_specs? body (SEMICOLON)? ;
declare_spec : pragma_declaration | constant_declaration | exception_declaration | variable_declaration | cursor_definition | routine_definition ; // nested routines
block : (DECLARE seq_of_declare_specs)? body ;
body : BEGIN seq_of_statements (EXCEPTION exception_handler+)? END label_name? ;statement is the canonical Oracle-PL/SQL set: if_statement,
five flavours of loop_statement, case_statement, cursor
manipulation (OPEN, CLOSE, FETCH, OPEN FOR),
raise_statement, raise_application_error_statement,
execute_immediate, commit_statement, rollback_statement,
plain assignment, and procedure calls. The if-elsif-else shape
is straightforward:
// PlcParser.g4 — if statementif_statement : IF expression THEN seq_of_statements elsif_part* else_part? END IF ;
elsif_part : ELSIF expression THEN seq_of_statements ;
else_part : ELSE seq_of_statements ;Cursors are equally close to PL/SQL:
// PlcParser.g4 — cursor declarationcursor_definition : CURSOR identifier ( (LPAREN cursor_parameter_list RPAREN)? | LPAREN RPAREN ) IS static_sql SEMICOLON ;
cursor_parameter : parameter_name IN? type_spec ;The third grammar, StaticSqlWithRecords.g4, is a secondary
grammar invoked only when an embedded INSERT/UPDATE/REPLACE has
the … VALUES record or … SET ROW = record syntax — the
patterns that need PL/CSQL records to be expanded into column
lists at compile time:
// StaticSqlWithRecords.g4 — record syntactic sugarstmt_w_record_values : INSERT any+ (VALUES | VALUE) record_list (ON DUPLICATE KEY UPDATE any+)? EOF | REPLACE any+ (VALUES | VALUE) record_list EOF ;
stmt_w_record_set : UPDATE any* table_spec SET row_set any* EOF | INSERT any* table_spec SET row_set (ON DUPLICATE KEY UPDATE any+)? EOF | REPLACE any* table_spec SET row_set EOF ;
row_set : ROW '=' record ;record : REGULAR_ID ;Most of the SQL syntax is opaque (OPAQUE token) — only the
VALUES … / SET ROW = … islands are extracted, because only
those need PL/CSQL-side expansion before the SQL fragment is
sent to CUBRID’s main SQL parser.
Front-end — parse tree to AST
Section titled “Front-end — parse tree to AST”PlcsqlCompilerMain.compileInner orchestrates the full
pipeline. It wires the ANTLR lexer/parser, runs
parser.sql_script() to obtain a ParseTree, and walks it with
ParseTreeConverter (3,449 lines — the bulk of the
front-end):
// PlcsqlCompilerMain.compileInner — pl_engine/.../compiler/PlcsqlCompilerMain.javaParseTree tree = parse(input, verbose, sqlTemplate, logStore);// ...ParseTreeConverter converter = new ParseTreeConverter(iStore, owner, revision);Unit unit = (Unit) converter.visit(tree);// ...converter.askServerSemanticQuestions();// ...TypeChecker typeChecker = new TypeChecker(iStore, converter.symbolStack, converter.dependenciesOfStaticSql, owner, sqlUsesInRecursiveCalls);typeChecker.visitUnit(unit);// ...String javaCode = new JavaCodeWriter(iStore, sqlUsesInRecursiveCalls).buildCodeLines(unit);The AST lives in compiler/ast/. Roughly 90 node classes split
into three families. Decl* for declarations:
DeclProgram (the top-level Unit’s routine — a DeclProc
or DeclFunc extending DeclRoutine), DeclVar,
DeclConst, DeclCursor, DeclException, DeclLabel,
DeclParamIn, DeclParamOut. Stmt* for statements:
StmtBlock (anonymous block with optional declarations and
exception handlers), StmtIf, StmtBasicLoop, StmtWhileLoop,
StmtForIterLoop, StmtForCursorLoop, StmtForStaticSqlLoop,
StmtCase, StmtCursorOpen, StmtCursorFetch,
StmtCursorClose, StmtOpenFor, StmtExecImme,
StmtAssign, StmtRaise, StmtRaiseAppErr, StmtReturn,
StmtCommit, StmtRollback, StmtContinue, StmtExit,
StmtNull, StmtLocalProcCall, StmtGlobalProcCall,
StmtStaticSql. Expr* for expressions: ExprBinaryOp,
ExprUnaryOp, ExprId, ExprField, ExprUint, ExprFloat,
ExprStr, ExprDate, ExprDatetime, ExprTime,
ExprTimestamp, ExprNull, ExprTrue, ExprFalse,
ExprBetween, ExprIn, ExprLike, ExprCase, ExprCond,
ExprCursorAttr, ExprSqlCode, ExprSqlErrm,
ExprSqlRowCount, ExprSerialVal, ExprAutoParam (used to
auto-bind PL/CSQL locals into static SQL), ExprGlobalFuncCall,
ExprLocalFuncCall, ExprBuiltinFuncCall, ExprSyntaxedCallCast,
ExprSyntaxedCallChr, ExprSyntaxedCallAdddate,
ExprSyntaxedCallSubdate, ExprSyntaxedCallExtract,
ExprSyntaxedCallPosition, ExprSyntaxedCallTrim. The shape of
a node is uniform — a small dataclass with a constructor that
carries the ANTLR ParserRuleContext for line/column tracking,
plus an accept method that dispatches to AstVisitor:
// StmtIf — pl_engine/.../compiler/ast/StmtIf.javapublic class StmtIf extends Stmt { @Override public <R> R accept(AstVisitor<R> visitor) { return visitor.visitStmtIf(this); }
public final boolean forIfStmt; public final NodeList<CondStmt> condStmtParts; public final NodeList<Stmt> elsePart;
public StmtIf(ParserRuleContext ctx, boolean forIfStmt, NodeList<CondStmt> condStmtParts, NodeList<Stmt> elsePart) { super(ctx); this.forIfStmt = forIfStmt; this.condStmtParts = condStmtParts; this.elsePart = elsePart; }}Hierarchically:
classDiagram
class AstNode
class Decl
class Stmt
class Expr
AstNode <|-- Decl
AstNode <|-- Stmt
AstNode <|-- Expr
AstNode <|-- Body
AstNode <|-- ExHandler
AstNode <|-- TypeSpec
AstNode <|-- CondStmt
Decl <|-- DeclId
DeclId <|-- DeclIdTypeSpeced
DeclIdTypeSpeced <|-- DeclVar
DeclIdTypeSpeced <|-- DeclConst
DeclIdTypeSpeced <|-- DeclParam
DeclParam <|-- DeclParamIn
DeclParam <|-- DeclParamOut
Decl <|-- DeclRoutine
DeclRoutine <|-- DeclFunc
DeclRoutine <|-- DeclProc
Decl <|-- DeclCursor
Decl <|-- DeclException
Decl <|-- DeclLabel
Stmt <|-- StmtBlock
Stmt <|-- StmtIf
Stmt <|-- StmtBasicLoop
Stmt <|-- StmtWhileLoop
Stmt <|-- StmtForIterLoop
Stmt <|-- StmtForCursorLoop
Stmt <|-- StmtForStaticSqlLoop
Stmt <|-- StmtAssign
Stmt <|-- StmtCursorOpen
Stmt <|-- StmtCursorFetch
Stmt <|-- StmtCursorClose
Stmt <|-- StmtOpenFor
Stmt <|-- StmtExecImme
Stmt <|-- StmtRaise
Stmt <|-- StmtReturn
Expr <|-- ExprBinaryOp
Expr <|-- ExprUnaryOp
Expr <|-- ExprId
Expr <|-- ExprField
Expr <|-- ExprUint
Expr <|-- ExprStr
Expr <|-- ExprNull
Expr <|-- ExprCursorAttr
Expr <|-- ExprAutoParam
Expr <|-- ExprGlobalFuncCall
Expr <|-- ExprLocalFuncCall
Expr <|-- ExprBuiltinFuncCall
AstVisitor<R> (in compiler/visitor/AstVisitor.java, 221
lines) declares an abstract R visitXxx(...) for every node;
buildPath from concrete visitors that follows is below.
Semantic analysis — symbol table and type system
Section titled “Semantic analysis — symbol table and type system”ParseTreeConverter builds the AST and a SymbolStack
in one pass — every declaration is pushed onto the symbol
stack as it is encountered, every identifier reference is
resolved against the current frame. The stack has two static
levels (LEVEL_PREDEFINED = 0 for built-ins like SQRT and
the operator overloads, LEVEL_MAIN = 1 for the routine being
compiled) plus one dynamic level per BEGIN block:
// SymbolStack — pl_engine/.../compiler/SymbolStack.javapublic static final int LEVEL_PREDEFINED = 0;public static final int LEVEL_MAIN = 1;
private static SymbolTable predefinedSymbols = new SymbolTable(new Scope(null, null, "%predefined_0", LEVEL_PREDEFINED));
private static void addOperatorDecls() { // add SpLib static methods corresponding to operators Class c = Class.forName("com.cubrid.plcsql.predefined.sp.SpLib"); Method[] methods = c.getMethods(); for (Method m : methods) { if ((m.getModifiers() & Modifier.STATIC) > 0) { String name = m.getName(); if (name.startsWith("op")) { Operator opAnnot = m.getAnnotation(Operator.class); if (opAnnot != null) { // ... reflectively pull parameter types and register an // operator overload set for each PL/CSQL operator putOperator(name, op, opAnnot.coercionScheme()); } } } }}The unusual part is that operator implementations live in a
runtime class — com.cubrid.plcsql.predefined.sp.SpLib —
and the compiler reads them by reflection at startup. Each
public static method on SpLib whose name begins with op
and that carries @Operator(coercionScheme=...) is registered
as a PL/CSQL operator with its parameter types pulled from the
JVM signature. The same class also holds the runtime exception
types (CASE_NOT_FOUND, STORAGE_ERROR, PROGRAM_ERROR,
NO_DATA_FOUND, TOO_MANY_ROWS, etc.) that the emitted Java
code throws.
The type system (compiler/type/Type.java, 226 lines) is a
small fixed lattice: BOOLEAN, STRING, SHORT, INT,
BIGINT, NUMERIC, FLOAT, DOUBLE, DATE, TIME,
TIMESTAMP, DATETIME, SYS_REFCURSOR, plus three internal
ones (CURSOR, NULL, RECORD_ANY) used only by the
type-checker. Each Type holds the PL/CSQL name (plcName),
the fully qualified Java type (fullJavaType,
e.g. java.lang.Integer), and the unqualified Java code (javaCode)
that gets emitted. Coercion is a separate module
(Coercion.java, 534 lines, with CoercionScheme.java for
operator-level rules at 665 lines) — it picks the right
SpLib.convertXxxToYyy static method to insert when the
source type does not match the target.
The TypeChecker (visitor/TypeChecker.java, 1,491 lines)
walks the AST, decorates each Expr node with its inferred
type, picks an operator overload for every binary/unary op,
and inserts Coercion nodes where types disagree. It also
collects two crucial side artifacts: dependencies (the set
of _db_class, _db_serial, and _db_stored_procedure names
the compiled routine references — written into the catalog so
schema changes can invalidate the compiled artifact) and
sqlUsesInRecursiveCalls (the set of SqlUse AST sites that
sit inside a loop and therefore benefit from prepared-statement
caching).
Asking the C side for global semantics
Section titled “Asking the C side for global semantics”Some semantic checks need information the Java compiler cannot
have: the signature of a globally-declared procedure, whether a
name resolves to a serial, the type of a column that
%TYPE/%ROWTYPE references. ParseTreeConverter collects
these as global_semantics_question entries and
PlcsqlCompilerMain.compileInner flushes them to the C side in
one batch via askServerSemanticQuestions. The wire format is
defined in pl_struct_compile.cpp:
// global_semantics_request — src/sp/pl_struct_compile.cppvoidglobal_semantics_request::pack (cubpacking::packer &serializator) const{ serializator.pack_int (code); // METHOD_CALLBACK_GET_GLOBAL_SEMANTICS serializator.pack_all (qsqs); // vector of global_semantics_question}The C side answers with one
global_semantics_response_udpf / _serial / _column per
question, and the compile loop in
pl_compile_handler::compile keeps reading replies until the
final METHOD_REQUEST_COMPILE arrives:
// pl_compile_handler::compile — src/sp/pl_compile_handler.cppdo { error_code = read_request (response_blk, code); if (error_code == NO_ERROR) { cubmem::block &payload_blk = m_stack->get_data_queue().front ();
if (code == METHOD_REQUEST_COMPILE) { // final compiled artifact — copy out and exit the loop out_blk.extend_to (payload_blk.dim); std::memcpy (out_blk.get_ptr (), payload_blk.ptr, payload_blk.dim); } else if (code == METHOD_REQUEST_SQL_SEMANTICS) { // forward semantic question to the SQL parser and reply error_code = m_stack->send_data_to_client_recv (bypass_block, request); } else if (code == METHOD_REQUEST_GLOBAL_SEMANTICS) { // forward global semantics question to catalog and reply error_code = m_stack->send_data_to_client_recv (bypass_block, request); } // ... } }while (error_code == NO_ERROR && code != METHOD_REQUEST_COMPILE);This is a callback compile — the JVM drives the conversation,
the C side answers, and the JVM signals completion by sending
METHOD_REQUEST_COMPILE with the final byte payload. The same
mechanism handles SQL semantics: when the type-checker sees an
embedded SELECT … INTO, it sends the SQL text to the C side,
which parses it with the main SQL parser, computes column
types, and replies with sql_semantics — including the
rewritten query (with ? placeholders for PL/CSQL bind
variables) that the runtime will actually execute.
The CAS-side responders for these two opcodes
(callback_handler::get_sql_semantics, get_global_semantics) and
the broader PL↔server callback channel they ride on are documented in
the third PL-family sibling, cubrid-pl-server-bridge.md
§“Compile-time bridge for PL/CSQL embedded SQL”.
Code generation — JavaCodeWriter
Section titled “Code generation — JavaCodeWriter”JavaCodeWriter (3,783 lines) is the largest file in the
compiler. Its strategy is string-template emission: each
visitXxx returns a CodeToResolve carrying a multiline
template with %'PLACEHOLDER'% slots that get expanded by
recursively visiting the child nodes. The Unit template wraps
the user’s body in a Java class with one public static
method:
// JavaCodeWriter — pl_engine/.../compiler/visitor/JavaCodeWriter.javaprivate static final String[] tmplUnit = new String[] { "%'+IMPORTS'%", "import static com.cubrid.plcsql.predefined.sp.SpLib.*;", "", "public class %'CLASS-NAME'% {", "", " public static %'RETURN-TYPE'% %'METHOD-NAME'%(", " %'+PARAMETERS'%", " ) throws Exception {", " try {", " %'GET-CONNECTION'%", " %'+MAIN-USER-CODE'%", " } catch (PlcsqlRuntimeError e) {", " Throwable c = e.getCause();", " int[] pos = getPlcLineColumn(codeRangeMarkerList, c == null ? e : c, \"%'CLASS-NAME'%.java\");", " throw e.setPlcLineColumn(pos);", " } catch (OutOfMemoryError e) {", " Server.log(e);", " throw new STORAGE_ERROR().setPlcLineColumn(...);", // ... more catch clauses for StackOverflowError, ClassCastException, Throwable " }", " }", " %'+RECORD-DEFS'%", " %'+RECORD-ASSIGN-FUNCS'%"};The class name is computed deterministically from the
procedure name plus a revision counter so re-creates do not
collide with cached class objects. The catch chain pins error
provenance: every runtime exception is decorated with
pos = getPlcLineColumn(codeRangeMarkerList, ...) so the user
sees a PL/CSQL line/column rather than a Java stack trace.
That codeRangeMarkerList is built side-by-side with the
emitted source — every CodeTemplate carries
Misc.getLineColumnOf(node.ctx), which gets stitched into a
range-marker string at the bottom of the class.
if/elsif/else is the textbook example. The PL/CSQL form
IF cond1 THEN s1; ELSIF cond2 THEN s2; ELSE s3; END IF;becomes
if (cond1) { s1; } else if (cond2) { s2; } else { s3; }via these two templates:
// visitStmtIf — pl_engine/.../compiler/visitor/JavaCodeWriter.javaprivate static String[] tmplStmtIfWithoutElse = new String[] {"%'+COND-PARTS'%"};
private static String[] tmplStmtIfWithElse = new String[] {"%'+COND-PARTS'% else {", " %'+ELSE-PART'%", "}"};
@Overridepublic CodeToResolve visitStmtIf(StmtIf node) { if (node.forIfStmt && node.elsePart == null) { return new CodeTemplate( "StmtIf", Misc.UNKNOWN_LINE_COLUMN, tmplStmtIfWithoutElse, "%'+COND-PARTS'%", visitNodeList(node.condStmtParts).setDelimiter(" else")); } else { Object elsePart = node.elsePart == null ? "throw new CASE_NOT_FOUND();" : visitNodeList(node.elsePart); return new CodeTemplate( "StmtIf", Misc.getLineColumnOf(node.ctx), tmplStmtIfWithElse, "%'+COND-PARTS'%", visitNodeList(node.condStmtParts).setDelimiter(" else"), "%'+ELSE-PART'%", elsePart); }}Two subtleties. First, forIfStmt distinguishes a true PL/CSQL
IF from a CASE lowered into the same StmtIf node — the
CASE form emits throw new CASE_NOT_FOUND() for the missing
else, matching Oracle’s semantics. Second, the
setDelimiter(" else") call on the NodeList produces
if (...) {} else if (...) {} else if (...) {} from a list of
CondStmt children, which is far cleaner than emitting else if pairs in the visitor.
Local procedure calls are wrapped in a single-use anonymous inner class so OUT parameters can be coerced and written back without leaking helpers into the user’s namespace:
// tmplStmtLocalProcCall — pl_engine/.../compiler/visitor/JavaCodeWriter.javaprivate static String[] tmplStmtLocalProcCall = new String[] { "new Object() { // local procedure call: %'PROC-NAME'%", " void invoke(%'PARAMETERS'%) throws Exception {", " %'+ALLOC-COERCED-OUT-ARGS'%", " %'BLOCK'%%'PROC-NAME'%(%'ARGS'%);", " %'+UPDATE-OUT-ARGS'%", " }", "}.invoke(", " %'+ARGUMENTS'%", ");"};The connection-required template at the top is also worth naming:
private static final String tmplGetConn = "final Connection conn = DriverManager.getConnection(\"jdbc:default:connection::?autonomous_transaction=%s\");";Every PL/CSQL routine that touches static SQL or EXECUTE IMMEDIATE opens a server-side default JDBC connection —
jdbc:default:connection — which routes back through
CUBRIDServerSideConnection (under
com.cubrid.jsp.jdbc/) to the same cub_server that called the
SP, so the SQL runs in the caller’s transaction. The
autonomous_transaction=true variant is selected when the
routine carries PRAGMA AUTONOMOUS_TRANSACTION, in which case
the JDBC driver opens a fresh top-level transaction.
Loop optimisation
Section titled “Loop optimisation”compiler/ast/loopOpt/ is a small package with three types:
LoopOptimizable (a marker interface), LocalRoutineCall,
and SqlUse. The optimisation question is: which static-SQL
sites and local-routine call sites are reachable from inside
a loop, and therefore deserve to be cached / prepared once?
TypeChecker collects these into the
sqlUsesInRecursiveCalls set; JavaCodeWriter reads the set
back to decide whether each StmtStaticSql should emit a
PreparedStatement that lives outside the loop body and gets
parameter-bound on each iteration, versus a per-iteration
createStatement().execute(text) for one-shot SQL.
// SqlUse — pl_engine/.../compiler/ast/loopOpt/SqlUse.javapublic interface SqlUse extends LoopOptimizable { void markAsReachableFromLoop(); int getSqlSerialNo(); boolean ofCallableStmt(); boolean usingRef(); void setToUseRef();}getSqlSerialNo() is a per-routine integer that the runtime
uses as a key into a Map<Integer, PreparedStatement> cache
held by the routine instance.
Built-ins and predefined runtime
Section titled “Built-ins and predefined runtime”pl_engine/.../com/cubrid/plcsql/builtin/ ships only one user-
visible package — DBMS_OUTPUT.java plus its MessageBuffer.java
helper. It implements enable, disable, getLine, getLines,
putLine, put, and newLine — the standard Oracle DBMS_OUTPUT
surface — by writing through to the per-thread Context’s
message buffer:
// DBMS_OUTPUT.putLine — pl_engine/.../builtin/DBMS_OUTPUT.javapublic static void putLine(String line) { Context c = getContext(); if (Context.getSystemParam(SysParam.DBMS_OUTPUT) != null && Context.getSystemParameterBool(SysParam.DBMS_OUTPUT)) { c.getMessageBuffer().enable(); } c.getMessageBuffer().putLine(line);}predefined/sp/SpLib.java is much larger — it carries
runtime exceptions (CASE_NOT_FOUND, NO_DATA_FOUND, …),
the op* operators that SymbolStack.addOperatorDecls reads
by reflection, the convertXxxToYyy coercion methods, and the
Query cursor wrapper that SYS_REFCURSOR resolves to
(Type.SYS_REFCURSOR is registered with
fullJavaType = "com.cubrid.plcsql.predefined.sp.SpLib.Query").
Compilation pipeline at CREATE PROCEDURE time
Section titled “Compilation pipeline at CREATE PROCEDURE time”When the user executes CREATE OR REPLACE PROCEDURE … LANGUAGE PLCSQL, the SQL parser flags the language and
jsp_cl.cpp takes the create-routine path. The relevant
dispatch is sp_info.lang == SP_LANG_PLCSQL (around
jsp_cl.cpp line 1132), which packs a compile_request and
hands it to plcsql_transfer_file:
// jsp_cl.cpp — pseudo, condensed from the create-procedure pathPLCSQL_COMPILE_REQUEST compile_request;PLCSQL_COMPILE_RESPONSE compile_response;// ...sp_info.lang = (SP_LANG_ENUM) PT_NODE_SP_LANG (statement);// ...if (sp_info.lang == SP_LANG_PLCSQL) { compile_request.code.assign (statement->sql_user_text, statement->sql_user_text_len); err = plcsql_transfer_file (compile_request, compile_response); }// ...if (!compile_request.code.empty ()) { assert (sp_info.lang == SP_LANG_PLCSQL); code_info.stype = SPSC_PLCSQL; // language tag for catalog // -- save the artifacts produced by the JVM sp_add_stored_procedure_code (code_info); }plcsql_transfer_file ends up at
compile_handler::compile (shown earlier), which packs the
request, writes it to the JVM connection, and then drives the
callback loop. On the JVM side, ExecuteThread.processCompile
is the matching consumer:
// ExecuteThread.processCompile — pl_engine/.../jsp/ExecuteThread.javaprivate void processCompile() throws Exception { unpacker.setBuffer(ctx.getInboundQueue().take()); readSessionParameter(unpacker); CompileRequest request = new CompileRequest(unpacker);
boolean verbose = request.mode.contains("v"); String inSource = request.code; String owner = request.owner;
CompileInfo info = null; try { info = PlcsqlCompilerMain.compilePLCSQL(inSource, owner, verbose); if (info.errCode == 0) { MemoryJavaCompiler compiler = new MemoryJavaCompiler(); SourceCode sCode = new SourceCode(info.className, info.translated);
// optional: dump translated Java to $CUBRID_TMP/icode/<className>.java if (Context.getSystemParameterBool(SysParam.STORED_PROCEDURE_DUMP_ICODE)) { Path dirPath = Paths.get(Server.getConfig().getTmpPath() + "/icode"); if (Files.notExists(dirPath)) Files.createDirectories(dirPath); Path path = dirPath.resolve(info.className + ".java"); Files.write(path, info.translated.getBytes(Context.getSessionCharset())); }
CompiledCodeSet codeSet = compiler.compile(sCode);
// package the compiled .class files into a JAR (in-memory) ByteArrayOutputStream baos = new ByteArrayOutputStream(); writeJar(codeSet, baos); byte[] data = baos.toByteArray();
info.compiledType = 1; // JAR info.compiledCode = Base64.getEncoder().encode(data); } } catch (Exception e) { info = new CompileInfo(-1, 0, 0, e.getMessage() != null ? e.getMessage() : "unknown compile error"); } finally { CUBRIDPacker packer = new CUBRIDPacker(ByteBuffer.allocate(1024)); info.pack(packer); Context.getCurrentExecuteThread().sendCommand(RequestCode.COMPILE, packer.getBuffer()); }}MemoryJavaCompiler is a thin wrapper around
javax.tools.JavaCompiler. Importantly, the compiler runs
in-process — no Runtime.exec("javac"), no temp files in the
common path:
// MemoryJavaCompiler — pl_engine/.../jsp/compiler/MemoryJavaCompiler.javapublic MemoryJavaCompiler() { compiler = ToolProvider.getSystemJavaCompiler(); if (compiler == null) { throw new IllegalStateException( "Cannot find the system Java compiler. Check that your class path includes tools.jar"); } if (PL_SERVER_PATH == null) { PL_SERVER_PATH = Server.getConfig().getVmPath() + File.separator + "pl_server.jar"; } useOptions("-encoding", Context.getSessionCharset().toString()); useOptions("-classpath", PL_SERVER_PATH);}
public synchronized CompiledCodeSet compile(SourceCode code) { DiagnosticCollector<JavaFileObject> collector = new DiagnosticCollector<>(); MemoryFileManager fileManager = new MemoryFileManager(compiler.getStandardFileManager(null, null, null)); JavaCompiler.CompilationTask task = compiler.getTask(null, fileManager, collector, options, null, Arrays.asList(code)); boolean result = task.call(); if (!result || collector.getDiagnostics().size() > 0) { // ... } return new CompiledCodeSet(code.getClassName(), fileManager.getCodeList());}The classpath is just pl_server.jar (the runtime / SpLib
classes), so the user’s procedure cannot reference arbitrary
Java packages — it can only use what SpLib and the JDBC
shim expose. SourceCode extends SimpleJavaFileObject and
the MemoryFileManager.getJavaFileForOutput returns a
CompiledCode (also a SimpleJavaFileObject) that captures
the bytes the compiler writes. The bytes are bundled into a
JAR in memory by writeJar, Base64-encoded, and stuffed into
info.compiledCode. The wire format on the way back is
compile_response:
// compile_response::pack — src/sp/pl_struct_compile.cppvoidcompile_response::pack (cubpacking::packer &serializator) const{ serializator.pack_int (err_code); if (err_code < 0) { serializator.pack_int (err_line); serializator.pack_int (err_column); serializator.pack_string (err_msg); } else { serializator.pack_string (translated_code); // the Java source we just compiled serializator.pack_string (register_stmt); // the synthesized "ALTER PROCEDURE ... AS '<sig>';" serializator.pack_string (class_name); serializator.pack_string (java_signature);
serializator.pack_int (compiled_type); // 1 = JAR, -1 = none if (compiled_type >= 0) { serializator.pack_string (compiled_code); // Base64 JAR bytes }
// dependencies on tables, serials, other SPs serializator.pack_int ((int) dependencies.size ()); for (auto &dep : dependencies) dep.pack (serializator); }}Both the translated Java source and the compiled JAR
bytes round-trip back. sp_add_stored_procedure_code (in
sp_catalog.cpp, around line 790) writes them into the
_db_stored_procedure_code system class:
// sp_add_stored_procedure_code — src/sp/sp_catalog.cppdb_make_int (&value, info.stype); // SPSC_PLCSQLerr = dbt_put_internal (obt_p, SP_CODE_ATTR_STYPE, &value);// ...db_make_varchar (&value, DB_DEFAULT_PRECISION, info.scode.data (), info.scode.length (), lang_get_client_charset (), lang_get_client_collation ());err = dbt_put_internal (obt_p, SP_CODE_ATTR_SCODE, &value); // source: translated Java text// ...db_make_int (&value, info.otype); // 1 = JARerr = dbt_put_internal (obt_p, SP_CODE_ATTR_OTYPE, &value);// ...db_make_varchar (&value, DB_DEFAULT_PRECISION, info.ocode.data (), info.ocode.length (), lang_get_client_charset (), lang_get_client_collation ());err = dbt_put_internal (obt_p, SP_CODE_ATTR_OCODE, &value); // object: Base64 JARSo the SCODE column carries the emitted Java source (not
the original PL/CSQL — the original lives in _db_stored_procedure
under target / arg_default_string and similar fields), and
the OCODE column carries the compiled JAR bytes. Both are
varchars; the JAR is held as Base64 to keep it inside CUBRID’s
DB_VARCHAR encoding rules.
Putting it together:
sequenceDiagram
autonumber
participant U as User (csql)
participant S as cub_server (C)
participant J as pl_server JVM
participant C as PlcsqlCompilerMain
participant M as MemoryJavaCompiler
U->>S: CREATE PROCEDURE foo ... LANGUAGE PLCSQL
S->>S: jsp_cl.cpp dispatch (SP_LANG_PLCSQL)
S->>J: compile_request{code, owner, mode}
J->>C: ExecuteThread.processCompile()
C->>C: ANTLR lex + parse → ParseTree
C->>C: ParseTreeConverter → AST (Unit)
C->>S: METHOD_REQUEST_GLOBAL_SEMANTICS (col types, SP sigs, serials)
S-->>C: global_semantics_response
C->>C: TypeChecker.visitUnit(unit)
C->>S: METHOD_REQUEST_SQL_SEMANTICS (per static SQL)
S-->>C: sql_semantics (rewritten query, host vars)
C->>C: JavaCodeWriter.buildCodeLines(unit) → translated Java source
C->>M: compile(SourceCode) — javax.tools.JavaCompiler
M-->>C: CompiledCodeSet (in-memory .class files)
C->>C: writeJar → byte[] → Base64
C-->>S: compile_response{translated, class_name, signature, ocode (JAR), dependencies}
S->>S: sp_add_stored_procedure_code → _db_stored_procedure_code
S->>S: sp_add_stored_procedure → _db_stored_procedure
S-->>U: NO_ERROR
Runtime — invocation reuses the JavaSP machinery
Section titled “Runtime — invocation reuses the JavaSP machinery”At call time, PL/CSQL is indistinguishable from JavaSP. The
catalog row has language SP_LANG_PLCSQL and a Java signature
like <class_name>.<method_name>(args)Ljava/lang/Integer;. The
C-side pl_executor.cpp packs the call into a
cubmethod::header{SP_CODE_INVOKE} request and the JVM
matches the signature against an already-loaded class — or
loads the JAR from _db_stored_procedure_code on first call,
caches the resulting Class<?> object in a MemoryClass
holder under a session-scoped class loader
(SessionClassLoader / SessionClassLoaderManager), and
invokes the static method. From the JVM’s point of view the
generated code is just another stored procedure; the only
PL/CSQL-specific runtime state is the per-call cache of
PreparedStatements for loop-optimised SQL (driven by
SqlUse.getSqlSerialNo()).
sequenceDiagram
autonumber
participant U as Caller (SQL: CALL foo(...))
participant S as cub_server
participant J as pl_server JVM
participant K as Loaded class (foo$<rev>)
U->>S: CALL foo(1, 'x')
S->>S: lookup _db_stored_procedure → java_signature, lang=PLCSQL
S->>J: SP_CODE_INVOKE{sig, args}
J->>J: SessionClassLoader.findClass(class_name)
alt class not loaded
J->>S: read _db_stored_procedure_code.OCODE (Base64 JAR)
S-->>J: JAR bytes
J->>J: defineClass each entry, cache in MemoryClass
end
J->>K: invoke static method via reflection
K->>K: DriverManager.getConnection("jdbc:default:connection::?autonomous_transaction=...")
K->>S: SQL via CUBRIDServerSideConnection
S-->>K: result sets / rowcount
K-->>J: return value / OUT args
J-->>S: pack result / out args
S-->>U: result
The class loader story is more involved than this picture
suggests — SessionClassLoaderGroup, ClassLoaderManager,
and ServerClassLoader cooperate to give every CUBRID session
a private namespace so concurrent re-creates of the same
procedure name never collide — but for a single PL/CSQL
routine the picture is correct.
Source Walkthrough
Section titled “Source Walkthrough”Symbols grouped by phase. Position hints are scoped to this
document’s updated: date.
Grammars (ANTLR)
Section titled “Grammars (ANTLR)”PlcLexer.g4— token definitions,STATIC_SQLlexer mode,staticSqlParenMatch/checkFirstLParenmode predicates.PlcParser.g4—sql_script,create_routine,routine_definition,block,body,statement,if_statement,loop_statement,cursor_definition,static_sql,case_statement,expression,concatenation,atom,type_spec,native_datatype,percent_type,percent_rowtype,literal,identifier.StaticSqlWithRecords.g4—stmt_w_record_values,stmt_w_record_set,record,row_set(only used when an embedded INSERT/UPDATE/REPLACE references a PL/CSQL record).
Front-end driver
Section titled “Front-end driver”PlcsqlCompilerMain.compilePLCSQL,compileInner,parse,SyntaxErrorIndicator(entry point — orchestrates the whole pipeline).PlcLexerEx(ANTLR lexer subclass — captures the originalCREATE PROCEDUREskeleton for re-use asregister_stmt).ParseTreeConverter(3,449 lines — parse tree → AST, buildsSymbolStack, queues upglobal_semantics_question/sql_semantics).ParseTreePrinter(debugging aid, dumps the parse tree tolog/PL-parse-tree.txt).
AST nodes
Section titled “AST nodes”compiler/ast/AstNode.java(root),Decl.java,Stmt.java,Expr.java,NodeList<E>,Body,ExHandler,ExName,TypeSpec,TypeSpecPercent,CondStmt,CondExpr.- Routine declarations:
DeclProgram(the unit),DeclRoutine,DeclProc,DeclFunc,DeclParam,DeclParamIn,DeclParamOut. - Variable declarations:
DeclVar,DeclConst,DeclIdTypeSpeced,DeclId,DeclLabel,DeclException,DeclCursor,DeclDynamicRecord,DeclForIter. - Statements:
StmtBlock,StmtIf,StmtCase,StmtBasicLoop,StmtWhileLoop,StmtForIterLoop,StmtForCursorLoop,StmtForStaticSqlLoop,StmtForSqlLoop,StmtAssign,StmtCursorOpen,StmtCursorFetch,StmtCursorClose,StmtOpenFor,StmtExecImme,StmtRaise,StmtRaiseAppErr,StmtReturn,StmtCommit,StmtRollback,StmtContinue,StmtExit,StmtNull,StmtLocalProcCall,StmtGlobalProcCall,StmtStaticSql. - Expressions:
ExprBinaryOp,ExprUnaryOp,ExprId,ExprField,ExprUint,ExprFloat,ExprStr,ExprDate,ExprDatetime,ExprTime,ExprTimestamp,ExprNull,ExprTrue,ExprFalse,ExprBetween,ExprIn,ExprLike,ExprCase,ExprCond,ExprCursorAttr,ExprSqlCode,ExprSqlErrm,ExprSqlRowCount,ExprSerialVal,ExprAutoParam,ExprGlobalFuncCall,ExprLocalFuncCall,ExprBuiltinFuncCall,ExprSyntaxedCallCast,ExprSyntaxedCallChr,ExprSyntaxedCallAdddate,ExprSyntaxedCallSubdate,ExprSyntaxedCallExtract,ExprSyntaxedCallPosition,ExprSyntaxedCallTrim. - Loop optimisation:
ast/loopOpt/LoopOptimizable,LocalRoutineCall,SqlUse.
Visitors
Section titled “Visitors”visitor/AstVisitor.java(abstract base, declaresvisitXxxfor every node type).visitor/TypeChecker.java(semantic phase — walks AST, picks operator overloads, insertsCoercion, populatesdependenciesandsqlUsesInRecursiveCalls).visitor/JavaCodeWriter.java(emitter —tmplUnit,tmplGetConn,tmplStmtIfWithoutElse,tmplStmtIfWithElse,tmplStmtLocalProcCall,tmplDeclBlock,tmplStmtForCursorLoop,tmplStmtForStaticSqlLoop, plus avisitXxxfor every AST node).
Symbol table and types
Section titled “Symbol table and types”SymbolStack.java,SymbolTable,Scope.java,Misc.java,InstanceStore.java,addOperatorDecls,addDbmsOutputProcedures,LEVEL_PREDEFINED,LEVEL_MAIN,noParenBuiltInFunc.type/Type.java(the lattice:BOOLEAN,STRING_ANY,INT,BIGINT,NUMERIC_ANY,FLOAT,DOUBLE,DATE,TIME,DATETIME,TIMESTAMP,SYS_REFCURSOR, internalCURSOR,NULL,RECORD_ANY,OBJECT).type/TypeChar.java,TypeVarchar.java,TypeNumeric.java,TypeRecord.java,TypeVariadic.java.Coercion.java,CoercionScheme.java,DBTypeAdapter.java(PL/CSQL ↔ DB_VALUE bridges),StaticSql.java.annotation/Operator(the marker onSpLib.opXxxstatic methods thatSymbolStack.addOperatorDeclsreads by reflection).
Server API (compile-time RPC)
Section titled “Server API (compile-time RPC)”serverapi/ServerAPI.java— abstract surface used by the compiler to ask the server semantic questions (getColumnInfo,getProcSignature,isSerial,getSqlSemantics).serverapi/SqlSemantics.java,PlParamInfo.java,ServerConstants.java.
Builtins / predefined runtime
Section titled “Builtins / predefined runtime”builtin/DBMS_OUTPUT.java,MessageBuffer.java.predefined/sp/SpLib.java(the main runtime — operators, exceptions, coercions,Querycursor wrapper) — referenced by every emitted class viaimport static com.cubrid.plcsql.predefined.sp.SpLib.*;.predefined/PlcsqlRuntimeError.java(parent ofCASE_NOT_FOUND,STORAGE_ERROR,PROGRAM_ERROR, …).
In-process Java compilation
Section titled “In-process Java compilation”jsp/compiler/MemoryJavaCompiler.java(wrapsToolProvider.getSystemJavaCompiler()),jsp/compiler/MemoryFileManager.java,jsp/code/SourceCode.java,CompiledCode.java,CompiledCodeSet.java,MemoryClass.java,ClassAccess.java,Signature.java,MemoryClassCache.java.- Class loaders:
jsp/classloader/SessionClassLoader.java,SessionClassLoaderGroup.java,SessionClassLoaderManager.java,ServerClassLoader.java,ClassLoaderManager.java.
C-side glue
Section titled “C-side glue”src/sp/jsp_cl.cpp—PT_NODE_SP_LANG,SP_LANG_PLCSQLdispatch in the create-procedure path,plcsql_transfer_file.src/sp/pl_compile_handler.cpp—compile_handlerconstructor / destructor,compile_handler::compile(drives the callback loop),compile_handler::read_request,compile_handler::create_error_response.src/sp/pl_struct_compile.cpp—compile_request::pack,compile_response::pack,plcsql_dependency,sql_semantics_request,sql_semantics_response,pl_parameter_info,global_semantics_question,global_semantics_request,global_semantics_response_common,global_semantics_response_udpf,global_semantics_response_serial,global_semantics_response_column.src/sp/sp_catalog.cpp—sp_add_stored_procedure_code,sp_edit_stored_procedure_code,SP_CODE_INFO,SP_CODE_CLASS_NAME,SP_CODE_ATTR_STYPE,SP_CODE_ATTR_SCODE,SP_CODE_ATTR_OTYPE,SP_CODE_ATTR_OCODE,SP_CODE_ATTR_NAME,SP_CODE_ATTR_CREATED_TIME,SP_ATTR_OWNER.src/sp/sp_code.cpp— companion code-management helpers used at invocation time.
Position hints (as observed when this doc was last updated:)
Section titled “Position hints (as observed when this doc was last updated:)”| Symbol | File | Line |
|---|---|---|
WITH / STATIC_SQL mode entry | pl_engine/.../antlr/PlcLexer.g4 | 36 |
SS_SEMICOLON mode-exit action | pl_engine/.../antlr/PlcLexer.g4 | 268 |
sql_script | pl_engine/.../antlr/PlcParser.g4 | 33 |
create_routine | pl_engine/.../antlr/PlcParser.g4 | 37 |
routine_definition | pl_engine/.../antlr/PlcParser.g4 | 41 |
block / body | pl_engine/.../antlr/PlcParser.g4 | 236 |
if_statement | pl_engine/.../antlr/PlcParser.g4 | 171 |
loop_statement (5 forms) | pl_engine/.../antlr/PlcParser.g4 | 183 |
cursor_definition | pl_engine/.../antlr/PlcParser.g4 | 95 |
stmt_w_record_values | pl_engine/.../antlr/StaticSqlWithRecords.g4 | 91 |
PlcsqlCompilerMain.compilePLCSQL | pl_engine/.../compiler/PlcsqlCompilerMain.java | 55 |
PlcsqlCompilerMain.compileInner | pl_engine/.../compiler/PlcsqlCompilerMain.java | 171 |
SymbolStack.addOperatorDecls | pl_engine/.../compiler/SymbolStack.java | 75 |
Type.IDX_* constants | pl_engine/.../compiler/type/Type.java | 82 |
JavaCodeWriter.tmplUnit | pl_engine/.../compiler/visitor/JavaCodeWriter.java | 125 |
JavaCodeWriter.tmplGetConn | pl_engine/.../compiler/visitor/JavaCodeWriter.java | 108 |
JavaCodeWriter.visitStmtIf | pl_engine/.../compiler/visitor/JavaCodeWriter.java | 2384 |
JavaCodeWriter.tmplStmtLocalProcCall | pl_engine/.../compiler/visitor/JavaCodeWriter.java | 2416 |
StmtIf (AST class) | pl_engine/.../compiler/ast/StmtIf.java | 36 |
SqlUse (loop-opt interface) | pl_engine/.../compiler/ast/loopOpt/SqlUse.java | 33 |
MemoryJavaCompiler.compile | pl_engine/.../jsp/compiler/MemoryJavaCompiler.java | 74 |
ExecuteThread.processCompile | pl_engine/.../jsp/ExecuteThread.java | 395 |
CompileInfo.pack | pl_engine/.../jsp/data/CompileInfo.java | 73 |
compile_handler::compile (callback loop) | src/sp/pl_compile_handler.cpp | 92 |
compile_response::pack | src/sp/pl_struct_compile.cpp | 78 |
global_semantics_request::pack | src/sp/pl_struct_compile.cpp | 558 |
sp_add_stored_procedure_code | src/sp/sp_catalog.cpp | 790 |
sp_edit_stored_procedure_code | src/sp/sp_catalog.cpp | 907 |
DBMS_OUTPUT.putLine | pl_engine/.../plcsql/builtin/DBMS_OUTPUT.java | 114 |
Cross-check Notes
Section titled “Cross-check Notes”The task description framed the artifact-storage question as
“is the generated artifact stored as Java source or compiled
.class bytes?”. The honest answer is both. Reading
pl_struct_compile.cpp and sp_catalog.cpp together:
compile_response::packalways emitstranslated_code(the Java source string) for any successful compile, and the C side stores it in_db_stored_procedure_code.SCODEviasp_add_stored_procedure_code. So the Java source CUBRID emitted is always retrievable from the catalog (which is whatSHOW CREATE PROCEDUREshows for PL/CSQL — along with the original PL/CSQL text on the parent_db_stored_procedurerow).compiled_typeis1whenever the JVM successfully ranMemoryJavaCompiler— meaning a compiled JAR is also shipped back, Base64-encoded intocompiled_code, and stored in_db_stored_procedure_code.OCODEwithOTYPE=1.compiled_type == -1is reserved for the future case where the JVM emits source but skips compilation (e.g. on a later-bound compile, or ifjavacis unavailable). ReadingExecuteThread.processCompileshows the only path that clearscompiledTypeis the exception path, which setserrCode = -1and skips the success packing entirely. So in practice every successful PL/CSQL CREATE produces both source and JAR.
Two further verification points:
- The path written by
Files.write(path, info.translated.getBytes(...))under$CUBRID_TMP/icode/<className>.javais only a debug dump, gated by the system parameterSTORED_PROCEDURE_DUMP_ICODE. It is not a primary storage location — the primary storage is the catalog row. - The Java class is loaded from the catalog bytes, not
from
$CUBRID_TMP/icode.SessionClassLoaderreads_db_stored_procedure_code.OCODEdirectly when the class is first referenced, not the sidecar.javadump. The debug dump exists only so the operator can inspect the emitted Java when something goes wrong in production.
A second cross-check, this time about the parser entry point:
the grammar lists sql_script : create_routine EOF, which
implies PL/CSQL only ever compiles a single
CREATE PROCEDURE / CREATE FUNCTION per call. There is no
support for compiling a free-standing anonymous block from the
top level. That matches the ExecuteThread.processCompile
behaviour, where the request always carries one code string
plus an owner and a mode.
Open Questions
Section titled “Open Questions”- Recompile-on-dependency-change. The compiled artifact
carries a
dependenciesset (table names, serial names, global-SP names). What is the policy when a referenced table is dropped or its column types change? Does the catalog mark the procedure invalid, force a recompile on next call, or fail the call? The mechanism for tracking invalidation lives somewhere insp_code.cpp/sp_catalog.cpp, but the policy was not visible in the files read here. - Versioning across upgrades.
MemoryJavaCompileruses whatevertools.jarships with the JDK that runspl_server. After a JDK upgrade, do existing JARs stored in_db_stored_procedure_code.OCODEstill load? They should — Java class-file backward compatibility is strong — but a forward-compat upgrade (older JDK, newer class file version) would fail. SqlUse.usingRefsemantics. The interface hassetToUseRef/usingRef, suggesting some sites get rewritten to use a reference handle instead of a fresh prepared statement. The exact criterion was not visible from the loopOpt package alone — it lives inTypeCheckerorJavaCodeWriter’s loop-bodies path.- Error-line backmapping under aggressive optimiser. The
codeRangeMarkerListmachinery inJavaCodeWriterrecords(plcLine, plcCol)for each emitted Java span. If the user’s PL/CSQL is preprocessed (e.g., aCASEexpression compiled toStmtIf), is the recorded line the outerCASEkeyword or the synthesizedif? The visitor usesMisc.getLineColumnOf(node.ctx)for the synthetic node, which inherits the originalctxonly if the conversion inParseTreeConverterreused it; this is worth verifying for diagnosability. - Concurrent recompile. Two sessions issuing
CREATE OR REPLACE PROCEDURE foosimultaneously: the catalog write is serialised under the standard MVCC + lock rules, but the JVM-side class-name uses therevisioncounter, which is static onPlcsqlCompilerMain(private static int revision = 1;) — so revisions are per-JVM-process, not per-DB. After apl_serverrestart the counter resets. This is fine for class-loader identity but might confuse debug dumps that name<class>$<rev>.javain$CUBRID_TMP/icode/.
Sources
Section titled “Sources”PL/CSQL compiler (Java side):
pl_engine/pl_server/src/main/antlr/PlcLexer.g4pl_engine/pl_server/src/main/antlr/PlcParser.g4pl_engine/pl_server/src/main/antlr/StaticSqlWithRecords.g4pl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/PlcsqlCompilerMain.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/ParseTreeConverter.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/SymbolStack.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/Coercion.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/CoercionScheme.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/StaticSql.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/InstanceStore.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/Misc.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/PlcLexerEx.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/visitor/AstVisitor.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/visitor/TypeChecker.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/visitor/JavaCodeWriter.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/ast/*.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/ast/loopOpt/*.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/type/*.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/serverapi/*.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/compiler/annotation/*.java
Runtime / built-ins (Java side):
pl_engine/pl_server/src/main/java/com/cubrid/plcsql/builtin/DBMS_OUTPUT.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/builtin/MessageBuffer.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/predefined/sp/SpLib.javapl_engine/pl_server/src/main/java/com/cubrid/plcsql/predefined/PlcsqlRuntimeError.java
In-process Java compilation:
pl_engine/pl_server/src/main/java/com/cubrid/jsp/compiler/MemoryJavaCompiler.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/compiler/MemoryFileManager.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/code/SourceCode.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/code/CompiledCode.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/code/CompiledCodeSet.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/code/MemoryClass.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/code/MemoryClassCache.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/ExecuteThread.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/data/CompileInfo.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/classloader/SessionClassLoader.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/classloader/SessionClassLoaderGroup.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/classloader/SessionClassLoaderManager.javapl_engine/pl_server/src/main/java/com/cubrid/jsp/classloader/ClassLoaderManager.java
C-side glue:
src/sp/AGENTS.mdsrc/sp/jsp_cl.cppsrc/sp/pl_compile_handler.cppsrc/sp/pl_compile_handler.hppsrc/sp/pl_struct_compile.cppsrc/sp/pl_struct_compile.hppsrc/sp/sp_catalog.cppsrc/sp/sp_code.cpp
Architectural overview:
pl_engine/AGENTS.md