Ask Questions and Find Answers
Important:
Ask is now read-only. You can review any existing questions and answers, but not add anything new.
But - don't panic! While ask is no more, we've replaced it with discuss - the new Liferay Discussion Forum! Read more here here or just visit the site here:
discuss.liferay.com
RE: database access returning duplicate data
I am accessing a database of user information. I am using service builder to create the data access layer. It works most of the time but I have had the following problem. I have a table that has username and group as columns (among other things). Users may belong to multiple groups. When I search for all rows with a particular user name I get the correct number of responses but all of the responses are the same. For example if I query for username "smith" and smith is a member of 3 groups, "group1", "group2", "group3". I get 3 rows back but they are all duplicates of the first row. That is I get "smith", "group1" returned 3 times. Using an sql command tool to make the selection returns the expected result of 3 unique rows. Has anyone else had this sort of problem?
I am running liferay plugins sdk 6.06 for eclipse.
Best,
Dave Semeraro
I am running liferay plugins sdk 6.06 for eclipse.
Best,
Dave Semeraro
If you enable hibernate query logging, you should be able to see the SQL that is being issued. Most likely you've got some sort of code issue, but seeing the query being used will be the key to solving it...
Thanks,
How do I enable hibernate query logging?
Dave
How do I enable hibernate query logging?
Dave
You can add the following line to your portal-ext.properties:
After it you should restart your portal.
hibernate.show_sql=true
After it you should restart your portal.
You could also add the "DISTINCT" keyword to your sql query (if you are using Custom SQL Finder) like this:
SELECT DISTINCT objectName.
I think the best way however is to first take a look at the query being generated by hibernate. If it's same as the query you ran manually on the database - then it should deliver the same results
Hope this helps.
SELECT DISTINCT objectName.
I think the best way however is to first take a look at the query being generated by hibernate. If it's same as the query you ran manually on the database - then it should deliver the same results
Hope this helps.
Thanks,
I turned on hibernate logging. The logs show the following query for the particular search I am doing.
Hibernate: select acctvimpl0_.user_id as user1_866_, acctvimpl0_.psn as psn866_, acctvimpl0_.login as login866_, acctvimpl0_.last_name as last4_866_, acctvimpl0_.first_name as first5_866_, acctvimpl0_.address1 as address6_866_, acctvimpl0_.address2 as address7_866_, acctvimpl0_.city as city866_, acctvimpl0_.state as state866_, acctvimpl0_.zip_code as zip10_866_, acctvimpl0_.country as country866_, acctvimpl0_.email as email866_, acctvimpl0_.phone as phone866_, acctvimpl0_.extension as extension866_, acctvimpl0_.organization as organiz15_866_, acctvimpl0_.is_pi as is16_866_, acctvimpl0_.is_am as is17_866_, acctvimpl0_.is_active as is18_866_, acctvimpl0_.is_vetted as is19_866_, acctvimpl0_.is_valid as is20_866_, acctvimpl0_.pi_confirmation as pi21_866_, acctvimpl0_.pi_notified as pi22_866_, acctvimpl0_.requested_email as requested23_866_, acctvimpl0_.status as status866_, acctvimpl0_.other_state as other25_866_ from acctv acctvimpl0_ where (acctvimpl0_.is_pi=? )
All of the queries in the log have a where clause similar to the one above. The actual where clause should be where is_pi = 1. I dont know why there is a question mark there. The query returns a list of projects and users for which the users are primary investigators ( is_pi = 1) but in the case where an investigator is PI of two or more projects, the query returns two or more results but the results are the same. For example, if Smith is PI on project alpha and project beta I would get two results that both listed project alpha.
User Project Other info......
Smith alpha blather
Smith alpha blather <------ expecting this to be " Smith beta other blather "
Is the above hibernate query correct or am I missing something?
Thanks for the help by the way. I am connecting to an existing database and having some issues. Usually this stuff works fine when I let Liferay build the tables.
Dave
I turned on hibernate logging. The logs show the following query for the particular search I am doing.
Hibernate: select acctvimpl0_.user_id as user1_866_, acctvimpl0_.psn as psn866_, acctvimpl0_.login as login866_, acctvimpl0_.last_name as last4_866_, acctvimpl0_.first_name as first5_866_, acctvimpl0_.address1 as address6_866_, acctvimpl0_.address2 as address7_866_, acctvimpl0_.city as city866_, acctvimpl0_.state as state866_, acctvimpl0_.zip_code as zip10_866_, acctvimpl0_.country as country866_, acctvimpl0_.email as email866_, acctvimpl0_.phone as phone866_, acctvimpl0_.extension as extension866_, acctvimpl0_.organization as organiz15_866_, acctvimpl0_.is_pi as is16_866_, acctvimpl0_.is_am as is17_866_, acctvimpl0_.is_active as is18_866_, acctvimpl0_.is_vetted as is19_866_, acctvimpl0_.is_valid as is20_866_, acctvimpl0_.pi_confirmation as pi21_866_, acctvimpl0_.pi_notified as pi22_866_, acctvimpl0_.requested_email as requested23_866_, acctvimpl0_.status as status866_, acctvimpl0_.other_state as other25_866_ from acctv acctvimpl0_ where (acctvimpl0_.is_pi=? )
All of the queries in the log have a where clause similar to the one above. The actual where clause should be where is_pi = 1. I dont know why there is a question mark there. The query returns a list of projects and users for which the users are primary investigators ( is_pi = 1) but in the case where an investigator is PI of two or more projects, the query returns two or more results but the results are the same. For example, if Smith is PI on project alpha and project beta I would get two results that both listed project alpha.
User Project Other info......
Smith alpha blather
Smith alpha blather <------ expecting this to be " Smith beta other blather "
Is the above hibernate query correct or am I missing something?
Thanks for the help by the way. I am connecting to an existing database and having some issues. Usually this stuff works fine when I let Liferay build the tables.
Dave
Dave Semeraro:
where (acctvimpl0_.is_pi=? )
All of the queries in the log have a where clause similar to the one above. The actual where clause should be where is_pi = 1. I dont know why there is a question mark there.
Hibernate uses precompiled queries and will substitute the ? with the query value you're using.
So the true test is whether in your sql tool you get the same dups when issuing that query (substituting the ? for 1 in your tool).
The query I use in the (toad) database access tool looks like this:
SELECT acctv.psn,
acctv.[login],
acctv.[user_id],
acctv.last_name,
acctv.first_name,
acctv.address1,
acctv.address2,
acctv.city,
acctv.state,
acctv.zip_code,
acctv.country,
acctv.email,
acctv.phone,
acctv.extension,
acctv.organization,
acctv.is_pi,
acctv.is_am,
acctv.is_active,
acctv.is_vetted,
acctv.is_valid,
acctv.pi_confirmation,
acctv.pi_notified,
acctv.requested_email,
acctv.status,
acctv.other_state
FROM bwportletdb.dbo.acctv acctv
WHERE (acctv.is_pi = 1)
This tool gives the correct result of 97 rows. Each row corresponds to a primary investigator. Each row has a unique user/project pairing.
The service builder query looks different in that there are "as" elements there that are not present in the toad query. But I would not think that would change the result.
Service builder also returns 97 rows of the table but instances where a user is PI of two different projects show up as duplicates. The first returned project for which a user is PI is repeated as many times as the person is PI of a project. If that makes sense.
dave
SELECT acctv.psn,
acctv.[login],
acctv.[user_id],
acctv.last_name,
acctv.first_name,
acctv.address1,
acctv.address2,
acctv.city,
acctv.state,
acctv.zip_code,
acctv.country,
acctv.email,
acctv.phone,
acctv.extension,
acctv.organization,
acctv.is_pi,
acctv.is_am,
acctv.is_active,
acctv.is_vetted,
acctv.is_valid,
acctv.pi_confirmation,
acctv.pi_notified,
acctv.requested_email,
acctv.status,
acctv.other_state
FROM bwportletdb.dbo.acctv acctv
WHERE (acctv.is_pi = 1)
This tool gives the correct result of 97 rows. Each row corresponds to a primary investigator. Each row has a unique user/project pairing.
The service builder query looks different in that there are "as" elements there that are not present in the toad query. But I would not think that would change the result.
Service builder also returns 97 rows of the table but instances where a user is PI of two different projects show up as duplicates. The first returned project for which a user is PI is repeated as many times as the person is PI of a project. If that makes sense.
dave
I have some more information. I tried grabbing all of the records in the database table acctv. I used the following code to do that.
int numberofaccts = acctvLocalServiceUtil.getacctvsCount();
List<acctv> biglist = acctvLocalServiceUtil.getacctvs(0, numberofaccts-1);
I then search the "biglist" for instances where is_pi = 1 and write out the project serial number and user last name like this.
for(int j=0;j<biglist.size();j++) {
if(biglist.get(j).getIs_pi() == 1)
System.out.println(biglist.get(j).getPsn() + " " + biglist.get(j).getLast_name());
}
The result was that for each user that was PI on multiple projects the data for that user was duplicated.
Another thing I noticed was that the database table has no primary key. It is actually a Sybase view in which there is no column with completely distinct and unique entries. I suspect this might be part of the problem. I indicated an arbitrary row (user_id) I believe to service builder as the primary key ( service builder wont build if you leave the primary key for an entity out). Since there are multiple entries in the table for users that are members of multiple projects or that are PI on multiple project the user_id column has repeated entries and is not unique.
dave
int numberofaccts = acctvLocalServiceUtil.getacctvsCount();
List<acctv> biglist = acctvLocalServiceUtil.getacctvs(0, numberofaccts-1);
I then search the "biglist" for instances where is_pi = 1 and write out the project serial number and user last name like this.
for(int j=0;j<biglist.size();j++) {
if(biglist.get(j).getIs_pi() == 1)
System.out.println(biglist.get(j).getPsn() + " " + biglist.get(j).getLast_name());
}
The result was that for each user that was PI on multiple projects the data for that user was duplicated.
Another thing I noticed was that the database table has no primary key. It is actually a Sybase view in which there is no column with completely distinct and unique entries. I suspect this might be part of the problem. I indicated an arbitrary row (user_id) I believe to service builder as the primary key ( service builder wont build if you leave the primary key for an entity out). Since there are multiple entries in the table for users that are members of multiple projects or that are PI on multiple project the user_id column has repeated entries and is not unique.
dave
One more shot at this. I was able to make one problem ( I wont call it a bug) go away at the cost of another. I am able to address the issue of no primary key by specifying two columns as primary (a composite) in the service.xml for the particular entity. The combination of the two keys ( loginid and projectname) is unique to each row. Users are not part of the same project more than once. This gives me the desired behavior when I search for PIs.
However if I try to get a count on the total number of entries in the table I get the following error:
org.hibernate.QueryException: path expression ends in a composite value: acctvimpl0_ [SELECT COUNT(acctv) FROM edu.illinois.ncsa.allocations.model.impl.acctvImpl acctv]
The line of code that generated that error was:
int numberofaccts = acctvLocalServiceUtil.getacctvsCount();
I imagine I may need to supply service builder with more data but am uncertain what.
The count call above was debugging code but it makes me nervous about not having specified the correct settings in service.xml. The only change I made was to select an additional column as a primary key in one entity. Is more required?
thanks to all,
Dave
However if I try to get a count on the total number of entries in the table I get the following error:
org.hibernate.QueryException: path expression ends in a composite value: acctvimpl0_ [SELECT COUNT(acctv) FROM edu.illinois.ncsa.allocations.model.impl.acctvImpl acctv]
The line of code that generated that error was:
int numberofaccts = acctvLocalServiceUtil.getacctvsCount();
I imagine I may need to supply service builder with more data but am uncertain what.
The count call above was debugging code but it makes me nervous about not having specified the correct settings in service.xml. The only change I made was to select an additional column as a primary key in one entity. Is more required?
thanks to all,
Dave
can you copy and paste the relevant part of your service.xml here?
Here is the section of the service.xml file that corresponds to the database table I am accessing.
<entity name="acctv" local-service="true" remote-service="false" table="acctv" cache-enabled="false" data-source="acctvDataSource" session-factory="acctvSessionFactory" tx-manager="acctvTransactionManager">
<column name="psn" type="String" primary="true" db-name="psn"></column>
<column name="login" type="String"></column>
<column name="user_id" type="int" primary="true" db-name="user_id"></column>
<column name="last_name" type="String"></column>
<column name="first_name" type="String"></column>
<column name="address1" type="String"></column>
<column name="address2" type="String"></column>
<column name="city" type="String"></column>
<column name="state" type="String"></column>
<column name="zip_code" type="String"></column>
<column name="country" type="String"></column>
<column name="email" type="String"></column>
<column name="phone" type="String"></column>
<column name="extension" type="String"></column>
<column name="organization" type="String"></column>
<column name="is_pi" type="int"></column>
<column name="is_am" type="int"></column>
<column name="is_active" type="int"></column>
<column name="is_vetted" type="int"></column>
<column name="is_valid" type="int"></column>
<column name="pi_confirmation" type="String"></column>
<column name="pi_notified" type="int"></column>
<column name="requested_email" type="int"></column>
<column name="status" type="String"></column>
<column name="other_state" type="String"></column>
<order by="asc">
<order-column name="last_name" order-by="asc"></order-column>
</order>
<finder name="Psn" return-type="Collection">
<finder-column name="psn"></finder-column>
</finder>
<finder name="PI" return-type="Collection">
<finder-column name="is_pi"></finder-column>
</finder>
</entity>
dave
<entity name="acctv" local-service="true" remote-service="false" table="acctv" cache-enabled="false" data-source="acctvDataSource" session-factory="acctvSessionFactory" tx-manager="acctvTransactionManager">
<column name="psn" type="String" primary="true" db-name="psn"></column>
<column name="login" type="String"></column>
<column name="user_id" type="int" primary="true" db-name="user_id"></column>
<column name="last_name" type="String"></column>
<column name="first_name" type="String"></column>
<column name="address1" type="String"></column>
<column name="address2" type="String"></column>
<column name="city" type="String"></column>
<column name="state" type="String"></column>
<column name="zip_code" type="String"></column>
<column name="country" type="String"></column>
<column name="email" type="String"></column>
<column name="phone" type="String"></column>
<column name="extension" type="String"></column>
<column name="organization" type="String"></column>
<column name="is_pi" type="int"></column>
<column name="is_am" type="int"></column>
<column name="is_active" type="int"></column>
<column name="is_vetted" type="int"></column>
<column name="is_valid" type="int"></column>
<column name="pi_confirmation" type="String"></column>
<column name="pi_notified" type="int"></column>
<column name="requested_email" type="int"></column>
<column name="status" type="String"></column>
<column name="other_state" type="String"></column>
<order by="asc">
<order-column name="last_name" order-by="asc"></order-column>
</order>
<finder name="Psn" return-type="Collection">
<finder-column name="psn"></finder-column>
</finder>
<finder name="PI" return-type="Collection">
<finder-column name="is_pi"></finder-column>
</finder>
</entity>
dave
Dave, so I know in your OP that you're using plugins SDK 6.0.6, what about the Portal, what version?
I did some research on the hibernate issue you're seeing and found this bug on composite keys: https://issues.liferay.com/browse/LPS-12002
I did some research on the hibernate issue you're seeing and found this bug on composite keys: https://issues.liferay.com/browse/LPS-12002
Wilson Man:
Dave, so I know in your OP that you're using plugins SDK 6.0.6, what about the Portal, what version?
I did some research on the hibernate issue you're seeing and found this bug on composite keys: https://issues.liferay.com/browse/LPS-12002
Using composite keys is skirting the issue. It is only a "workaround" not a real solution. And b/c of such workarounds, the accurate count of data cannot be retrieved.
One fix I found so far is to avoid using the generated queries that service builder gives me and just create my own custom query. When I use the same query that SB generated inside of a custom query, then I was able to get all of my data. It's painful to have to go this route but it's the only possible solution that I can find.
I would love for someone to point me wrong and to show me that the generated SB queries can deal with non-unique primary keys.
I would love for someone to point me wrong and to show me that the generated SB queries can deal with non-unique primary keys.
David H Nebinger:
If you enable hibernate query logging, you should be able to see the SQL that is being issued. Most likely you've got some sort of code issue, but seeing the query being used will be the key to solving it...
hi David. You did not answer the question. Instead, you just went off on a tangent. If we use service builder, all of the queries are generated by SB. And they are hidden from us. These are facts. But the problem only exists when the primary key is not unique. So the original problem is why can't SB figure out how to return all of the data instead of giving duplicate data if the primary keys are not unique.
The solution was given to use a composite primary key where two columns are used in order to distinguish the data. But that shouldn't have to be the case. And yet that is the only solution. Why is that?
David Lee:
David H Nebinger:If you enable hibernate query logging, you should be able to see the SQL that is being issued. Most likely you've got some sort of code issue, but seeing the query being used will be the key to solving it...
hi David. You did not answer the question. Instead, you just went off on a tangent. If we use service builder, all of the queries are generated by SB. And they are hidden from us. These are facts. But the problem only exists when the primary key is not unique. So the original problem is why can't SB figure out how to return all of the data instead of giving duplicate data if the primary keys are not unique.
The solution was given to use a composite primary key where two columns are used in order to distinguish the data. But that shouldn't have to be the case. And yet that is the only solution. Why is that?
Hmm, sorry if it seemed like a tangent. Often times the only way to figure out why a query is returning data is to look at the actual SQL and try it manually. The query is hidden, but by enabling the query logging you can see the query that is being issued.
The question about the composite key vs a single key, well a "primary key" is a key which identifies a unique row. If you need a composite key to identify a unique row, then that's what is necessary and you can't get around it.
If you declare a primary key for an entity as a part of a composite key, then you will always get duplicate records. And if you're joining to other entities/tables, those too will get duplicated. The primary key must define a unique record, that's the core definition of a primary key.
Many times I prefer to use a surrogate key for records as the primary key. A surrogate would be an integer column auto increment but has nothing to do with the natural key of the table. Your natural key may be the composite key, but the surrogate key is not tied to the data and therefore doesn't have to be a composite, but this may not be an option when using legacy tables.
Thanks for the reply David. So is there a way to use what SB generates or is using custom queries the best route? In the case of searching through a table that hold employees and the groups that they belong in but with no natural, surrogate, or unique primary key, what is the best option?
I don't think I'm following your entities...
Is it something like:
And you're asking about querying the emp-group table?
Is it something like:
+----------+ +----------+
| employee | | emp-group| +----------+
+----------+ +----------+ | group |
| emp_id | <------- | emp_id | +----------+
+----------+ | group_id | ------->> | group_id |
+----------+ | group |
+----------+
And you're asking about querying the emp-group table?
yes bingo!
Although at times, my table is actually a view and the view may not contain primary ids. IN such cases, it is really just employee name and then group name as opposed to employee id and group id that matches with their corresponding tables. And thus in such cases, there isn't any references back to the original employee tables or group tables.
But either way, the table emp-group would have a lot of rows that have duplicate employee ids
Sorry for the late reply by the way...
Although at times, my table is actually a view and the view may not contain primary ids. IN such cases, it is really just employee name and then group name as opposed to employee id and group id that matches with their corresponding tables. And thus in such cases, there isn't any references back to the original employee tables or group tables.
But either way, the table emp-group would have a lot of rows that have duplicate employee ids
+---------------+
| employee name |
+---------------+
| group name |
+---------------+
Sorry for the late reply by the way...
Okay, in that case you'll always be getting duplicate rows.
Your basic query, "select * from emp_group where emp='dnebing'" will have a separate row for each group that I'm in.
When you are doing a bunch, you can do something like "select * from emp_group order by emp,group" to get them in order so they're easier to process.
But at the end of the day, you still have the dups.
Personally, I like to hide these kinds of details in the service implementation. So for the first one, for example, I'd have a method in EmpGroupLocalServiceImpl:
public List<String> getEmployeeGroups(final String emp) {}
I'd sweep through the records returned, extract the group names and then return them as a simple list.
For the second case, I'd probably do:
public Map<String, List<String>> getAllEmployeeGroups() {}
In this method I'd process all of the records to populate the map into a usable form.
You know you're going to get duplicate records in cases like these. The best that you can do is deal with them, but do it in the service layer so you don't have to worry about them in your portlet code. The fun part, of course, is dealing with list changes, for example. If you can't do a simple "public void deleteEmpGroup(final String emp, final String group)" to delete individually and have to deal with "public void updateEmpGroups(final String emp, final List<String> groups)" sort of thing, then you end up processing the two lists to handle deletes and inserts. Not a lot of fun, but still doable, and better than having the portlet code trying to figure it all out...
Your basic query, "select * from emp_group where emp='dnebing'" will have a separate row for each group that I'm in.
When you are doing a bunch, you can do something like "select * from emp_group order by emp,group" to get them in order so they're easier to process.
But at the end of the day, you still have the dups.
Personally, I like to hide these kinds of details in the service implementation. So for the first one, for example, I'd have a method in EmpGroupLocalServiceImpl:
public List<String> getEmployeeGroups(final String emp) {}
I'd sweep through the records returned, extract the group names and then return them as a simple list.
For the second case, I'd probably do:
public Map<String, List<String>> getAllEmployeeGroups() {}
In this method I'd process all of the records to populate the map into a usable form.
You know you're going to get duplicate records in cases like these. The best that you can do is deal with them, but do it in the service layer so you don't have to worry about them in your portlet code. The fun part, of course, is dealing with list changes, for example. If you can't do a simple "public void deleteEmpGroup(final String emp, final String group)" to delete individually and have to deal with "public void updateEmpGroups(final String emp, final List<String> groups)" sort of thing, then you end up processing the two lists to handle deletes and inserts. Not a lot of fun, but still doable, and better than having the portlet code trying to figure it all out...
Does your problem solved?
Thank you David for your expert advise. I am very appreciative of your efforts.
For me so far, I have solved the dups using composite keys. And when that doesn't work, I end up using custom queries and going through FinderImpl in the service.persistence layer. IT's not so bad so I don't mind using it. But I was just curious if there was a simpler way and if SB can handle it.
I would recommend doing custom queries and going through FinderImpls for people who have dup issues with their data returns.
For me so far, I have solved the dups using composite keys. And when that doesn't work, I end up using custom queries and going through FinderImpl in the service.persistence layer. IT's not so bad so I don't mind using it. But I was just curious if there was a simpler way and if SB can handle it.
I would recommend doing custom queries and going through FinderImpls for people who have dup issues with their data returns.
I'm currently having the same issue with a DB View. Has anyone had this same issue with a DB View?
Copyright © 2025 Liferay, Inc
• Privacy Policy
Powered by Liferay™